Re: [CentOS] ZFS on Linux in production?

2013-11-04 Thread Markus Falb

On 24.Okt.2013, at 22:59, John R Pierce wrote:

 On 10/24/2013 1:41 PM, Lists wrote:
 Was wondering if anybody here could weigh in with real-life experience?
 Performance/scalability?
 
 I've only used ZFS on Solaris and FreeBSD. Some general observations...

...

 3) NEVER let a zpool fill up above about 70% full, or the performance 
 really goes downhill.

Why is that? It sounds cost intensive, if not ridiculous.
Disk space not to be used, forbidden land...
Is the remaining 30% used by some ZFS internals?

-- 
Markus

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-11-04 Thread Les Mikesell
On Mon, Nov 4, 2013 at 12:15 PM, Markus Falb wne...@gmail.com wrote:

 3) NEVER let a zpool fill up above about 70% full, or the performance
 really goes downhill.

 Why is it? It sounds cost intensive, if not ridiculous.
 Disk space not to used, forbidden land...
 Is the remaining 30% used by some ZFS internals?

Probably just simple physics.  If ZFS is smart enough to allocate
space 'near' other parts of the related files/directories/inodes, it
will do worse when there aren't any good choices left and it has to
fragment things into the only remaining spaces and make the disk heads
seek all over the place.  Might not be a big problem on SSDs, though.
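
For a rough way to keep an eye on how close a pool is getting to that
point, the standard tools already show it (the pool name 'tank' below
is just a placeholder):

    zpool list tank          # the CAP column shows how full the pool is
    zfs list -o space tank   # per-dataset breakdown of used vs. available space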

-- 
   Les Mikesell
 lesmikes...@gmail.com
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-11-04 Thread John R Pierce
On 11/4/2013 10:43 AM, Les Mikesell wrote:
 On Mon, Nov 4, 2013 at 12:15 PM, Markus Falb
 wne...@gmail.com  wrote:
 
 3) NEVER let a zpool fill up above about 70% full, or the performance
 really goes downhill.

 Why is it? It sounds cost intensive, if not ridiculous.
 Disk space not to used, forbidden land...
 Is the remaining 30% used by some ZFS internals?
 Probably just simple physics.  If ZFS is smart enough to allocate
 space 'near' other parts of the related files/directories/inodes it
 will have to do worse when there aren't any good choices and it has to
 fragment things into the only remaining spaces and make the disk heads
 seek all over the place.   Might not be a big problem on SSD's though.

even on 0 seek time SSDs, fragmenting files means more extents to track 
and process in order to read that file.


-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-11-04 Thread Nicolas Thierry-Mieg
On 11/04/2013 08:01 PM, John R Pierce wrote:
 On 11/4/2013 10:43 AM, Les Mikesell wrote:
 On Mon, Nov 4, 2013 at 12:15 PM, Markus Falb
 wne...@gmail.com  wrote:

 3) NEVER let a zpool fill up above about 70% full, or the performance
 really goes downhill.

 Why is it? It sounds cost intensive, if not ridiculous.
 Disk space not to used, forbidden land...
 Is the remaining 30% used by some ZFS internals?
 Probably just simple physics.  If ZFS is smart enough to allocate
 space 'near' other parts of the related files/directories/inodes it
 will have to do worse when there aren't any good choices and it has to
 fragment things into the only remaining spaces and make the disk heads
 seek all over the place.   Might not be a big problem on SSD's though.

 even on 0 seek time SSDs, fragmenting files means more extents to track
 and process in order to read that file.

but why would this be much worse with ZFS than, e.g., ext4?

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-11-04 Thread John R Pierce
On 11/4/2013 3:21 PM, Nicolas Thierry-Mieg wrote:
 but why would this be much worse with ZFS than eg ext4?

because ZFS works considerably differently than extfs... it's a 
copy-on-write system to start with.



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-11-03 Thread Rajagopal Swaminathan
Greetings,

On Fri, Oct 25, 2013 at 3:57 AM, Keith Keller
kkel...@wombat.san-francisco.ca.us wrote:

 I don't have my own, but I have heard of other shops which have had lots
 of success with ZFS on OpenSolaris and their variants.

And I know of a shop which could not recover a huge ZFS pool on FreeBSD and
had to opt for something like Isilon instead, due to the
unavailability of controller drivers for FreeBSD.


-- 
Regards,

Rajagopal
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-30 Thread Lists
On 10/25/2013 11:14 AM, Chuck Munro wrote:
 To keep the two servers in sync I use 'lsyncd' which is essentially a
 front-end for rsync that cuts down thrashing and overhead dramatically
 by excluding the full filesystem scan and using inotify to figure out
 what to sync.  This allows almost-real-time syncing of the backup
 machine.  (BTW, you need to crank the resources for inotify way up
 for large filesystems with a couple million files.)

Playing with lsyncd now, thanks for the tip!

One question though: why did you opt to use lsyncd rather than using ZFS 
snapshots/send/receive?

Thanks,

Ben
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-30 Thread Mailing List
To be honest, isn't it easier to install FreeBSD or Solaris on the server, where ZFS is 
natively supported? I moved my own server to FreeBSD and I didn't notice a huge 
difference between Linux distros and FreeBSD. I have no idea about Solaris, 
but it might be a similar environment. 

Sent from my iPhone

 On 30 Oct 2013, at 10:15 pm, Lists li...@benjamindsmith.com wrote:
 
 On 10/25/2013 11:14 AM, Chuck Munro wrote:
 To keep the two servers in sync I use 'lsyncd' which is essentially a
 front-end for rsync that cuts down thrashing and overhead dramatically
 by excluding the full filesystem scan and using inotify to figure out
 what to sync.  This allows almost-real-time syncing of the backup
 machine.  (BTW, you need to crank the resources for inotify way up
 for large filesystems with a couple million files.)
 
 Playing with lsyncd now, thanks for the tip!
 
 One question though: why did you opt to use lsyncd rather than using ZFS 
 snapshots/send/receive?
 
 Thanks,
 
 Ben
 ___
 CentOS mailing list
 CentOS@centos.org
 http://lists.centos.org/mailman/listinfo/centos
 
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-26 Thread Ray Van Dolson
On Thu, Oct 24, 2013 at 01:41:17PM -0700, Lists wrote:
 We are a CentOS shop, and have the lucky, fortunate problem of having 
 ever-increasing amounts of data to manage. EXT3/4 becomes tough to 
 manage when you start climbing, especially when you have to upgrade, so 
 we're contemplating switching to ZFS.
 
 As of last spring, it appears that ZFS On Linux http://zfsonlinux.org/ 
 calls itself production ready despite a version number of 0.6.2, and 
 being acknowledged as unstable on 32 bit systems.
 
 However, given the need to do backups, zfs send sounds like a godsend 
 over rsync which is running into scaling problems of its own. (EG: 
 Nightly backups are being threatened by the possibility of taking over 
 24 hours per backup)
 
 Was wondering if anybody here could weigh in with real-life experience? 
 Performance/scalability?
 
 -Ben
 
 PS: I joined their mailing list recently, will be watching there as 
 well. We will, of course, be testing for a while before making the 
 switch.

Joining the discussion late, and don't really have anything to
contribute on the ZFSonLinux side of things...

At $DAYJOB we have been running ZFS via Nexenta (previously via Solaris
10) for many years.  We have about 5PB of this and the primary use case
is for backups and handling of imagery.

For the most part, we really, really like ZFS.  My feeling is that ZFS
itself (at least in the *Solaris form) is rock solid and stable.  Other
pieces of the stack -- namely SMB/CIFS and some of the management tools
provided by the various vendors are a bit more questionable.  We spend
a bit more time fighting weirdnesses with things higher up the stack
than we do, say, on our NetApp environment.  To be expected.

I'm waiting for Red Hat or someone else to come out and support ZFS --
perhaps unlikely due to legality questions, but if I could marry the
power of ZFS with the software stack in Linux (Samba!!), I'd be mighty
happy.  Yes -- we could run Samba on our Nexenta boxes, but it isn't
supported.

Echo'ing what many others say:

- ZFS is memory hungry.  All of our PRD boxes have 144GB of memory in
  them, and some have SSD's for ZIL or L2ARC depending on the workload.
- Powerful redundancy is possible.  Our environment is built on top of
  Dell MD1200 JBOD's all dual pathed up to dual LSI SAS switches.  Our
  vdev's (RAID groups) are sized to match the number of JBODs with the
  individual disks spread across each JBOD.  We use triple parity RAID
  (RAIDZ3) and as such can lose three entire JBODs without suffering
  any data loss.  We actually had one JBOD go flaky on us and were able
  to hot yank it out, put in a new one with zero downtime (and much
  shorter resilver/rebuild times than you'd get with regular RAID).
- We make heavy use of snapshots and clones.  Probably have 200-300 on
  some systems and we use them to do release management for collections
  of imagery.  Very powerful and haven't run into performance issues
  yet.
  * Snapshots let us take diffs between versions quite easily.  We
then stream these diffs to an identical ZFS system at a DR site and
merge in the changes.  Our network pipe isn't big enough yet to do
this quickly, so we typically just plug in another SAS JBOD with a
zpool on it, stream the diffs there as a flat file, sneakernet the
JBOD to the DR site, plug it in, import the zpool and slurp in the
differences.  Pretty cool.
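
A rough sketch of that snapshot/send/receive flow (the pool, dataset,
and snapshot names here are invented, and exact flags can vary by ZFS
version):

    # on the source: snapshot, then write an incremental stream to the transfer pool
    zfs snapshot tank/imagery@2013-11-01
    zfs send -i tank/imagery@2013-10-01 tank/imagery@2013-11-01 \
        > /transfer/imagery-oct-nov.zfs
    # at the DR site: import the transfer pool and apply the stream
    zpool import transfer
    zfs receive tank/imagery < /transfer/imagery-oct-nov.zfs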

As I mentioned, we have run into a few weird quirks.  Mainly around
stability of the management GUI (or lack of basic features like
useful SNMP based monitoring), performance with CIFS and oddnesses
like high system load in certain edge cases.  Some general rough edges
I suppose that we've been OK dealing with.  The Nexenta guys are super
smart, but of course they're a smaller shop and don't have the
resources behind them that CentOS does with Red Hat.

My guess is that this would be exacerbated to some extent on the Linux
platform at this point.  I personally wouldn't want to use ZFS on Linux
for our customer data serving workloads, but might consider it for
something purely internal.

Ray
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-26 Thread Ray Van Dolson
On Thu, Oct 24, 2013 at 01:59:15PM -0700, John R Pierce wrote:
 On 10/24/2013 1:41 PM, Lists wrote:
  Was wondering if anybody here could weigh in with real-life experience?
  Performance/scalability?
 
 I've only used ZFS on Solaris and FreeBSD. Some general observations...
 
 1) you need a LOT of ram for decent performance on large zpools. 1GB ram 
 above your basic system/application requirements per terabyte of zpool 
 is not unreasonable.
 
 2) don't go overboard with snapshots.   a few 100 are probably OK, but 
 1000s (*) will really drag down the performance of operations that 
 enumerate file systems.
 
 3) NEVER let a zpool fill up above about 70% full, or the performance 
 really goes downhill.

Have run into this one (again -- with Nexenta) as well.  It can be
pretty dramatic.  We tend to set quotas to ensure we don't exceed 75%
or so max, but...

...at least on the Solaris side, there's a tunable you can set that
keeps the metaslab (which gets fragmented and inefficient when pool
utilization is high) entirely in memory.  This completely resolves our
throughput issue, but does require that you have sufficient memory to
load the thing...

  echo metaslab_debug/W 1 | mdb -kw

There may be a ZOL equivalent.
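
On ZFS on Linux the tunables are exposed as kernel module parameters
rather than through mdb, so a (hedged) way to look for a comparable
knob, with no guarantee that one exists in a given 0.6.x release, is
something like:

    # list metaslab-related parameters exposed by the loaded zfs module
    ls /sys/module/zfs/parameters/ | grep -i metaslab
    # a parameter can then be set at module load time, e.g. in /etc/modprobe.d/zfs.conf:
    #   options zfs <parameter_name>=1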

 4) I prefer using striped mirrors (aka raid10) over raidz/z2, but my 
 applications are primarily database.
 
 (*) ran into a guy who had 100s of zfs 'file systems' (mount points), 
 per user home directories, and was doing nightly snapshots going back 
 several years, and his zfs commands were taking a long long time to do 
 anything, and he couldn't figure out why.  I think he had over 10,000 
 filesystems * snapshots.

Ray
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-26 Thread George Kontostanos
On Sat, Oct 26, 2013 at 4:36 PM, Ray Van Dolson ra...@bludgeon.org wrote:

 On Thu, Oct 24, 2013 at 01:59:15PM -0700, John R Pierce wrote:
  On 10/24/2013 1:41 PM, Lists wrote:
   Was wondering if anybody here could weigh in with real-life experience?
   Performance/scalability?
 
  I've only used ZFS on Solaris and FreeBSD. Some general
 observations...
 
  1) you need a LOT of ram for decent performance on large zpools. 1GB ram
  above your basic system/application requirements per terabyte of zpool
  is not unreasonable.
 
  2) don't go overboard with snapshots.   a few 100 are probably OK, but
  1000s (*) will really drag down the performance of operations that
  enumerate file systems.
 
  3) NEVER let a zpool fill up above about 70% full, or the performance
  really goes downhill.

 Have run into this one (again -- with Nexenta) as well.  It can be
 pretty dramatic.  We tend to set quotas to ensure we don't exceed 75%
 or so max, but


 We may be getting a bit off topic here, but on that subject we have noticed
a significant degradation in performance on systems running at 75-80% of their
pool capacity. I understand that the nature of COW will increase
fragmentation. On large storage systems, though, 70% of 100TB means that you
always have to maintain 30TB free, which is not a small number in terms of
cost per TB.

-
George Kontostanos
---
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Warren Young
On Oct 24, 2013, at 8:01 PM, Lists li...@benjamindsmith.com wrote:

 Not sure enough of the vernacular

Yes, ZFS is complicated enough to have a specialized vocabulary.

I used two of these terms in my previous post:

- vdev, which is a virtual device, something like a software RAID.  It is one 
or more disks, configured together, typically with some form of redundancy.

- pool, which is one or more vdevs, which has a capacity equal to all of its 
vdevs added together.

 but let's say you have 4 drives in a 
 RAID 1 configuration: 1 set of 1 TB drives and another set of 2 TB drives.
 
 A1 - A2 = 2x 1TB drives, 1 TB redundant storage.
 B1 - B2 = 2x 2TB drives, 2 TB redundant storage.
 
 We have 3 TB of available storage.

Well, maybe.

You would have 3 TB *if* you configured these disks as two separate vdevs.

If you tossed all four disks into a single vdev, you could have only 2 TB 
because the smallest disk in a vdev limits the total capacity.

(This is yet another way ZFS isn't like a Drobo[*], despite the fact that a lot 
of people hype it as if it were the same thing.)

 Are you suggesting we add a couple of 
 4 TB drives:
 
 A1 - A2 = 2x 1TB drives, 1 TB redundant storage.
 B1 - B2 = 2x 2TB drives, 2 TB redundant storage.
 C1 - C2 = 2x 4TB drives, 4 TB redundant storage.
 
 Then wait until ZFS moves A1/A2 over to C1/C2 before removing A1/A2? If 
 so, that's capability I'm looking for.

No.  ZFS doesn't let you remove a vdev from a pool once it's been added, 
without destroying the pool.

The supported method is to add disks C1 and C2 to the *A* vdev, then tell ZFS 
that C1 replaces A1, and C2 replaces A2.  The filesystem will then proceed to 
migrate the blocks in that vdev from the A disks to the C disks. (I don't 
remember if ZFS can actually do both in parallel.)

Hours later, when that replacement operation completes, you can kick disks A1 
and A2 out of the vdev, then physically remove them from the machine at your 
leisure.  Finally, you tell ZFS to expand the vdev.

(There's an auto-expand flag you can set, so that last step can happen 
automatically.)
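
As a sketch of those steps (pool and device names are invented; check
zpool(8) on your platform for the exact syntax):

    # swap the old disks for the new, larger ones
    zpool replace tank <old-disk-A1> <new-disk-C1>
    zpool replace tank <old-disk-A2> <new-disk-C2>
    # once resilvering completes, let the vdev grow into the new capacity
    zpool set autoexpand=on tank
    zpool online -e tank <new-disk-C1> <new-disk-C2>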

If you're not seeing the distinction, it is that there never were 3 vdevs at 
any point during this upgrade.  The two C disks are in the A vdev, which never 
went away.

 But, XFS and ext4 can do that, too.  ZFS only wins when you want to add
 space by adding vdevs.
 
 The only way I'm aware of ext4 doing this is with resize2fs, which is 
 extending a partition on a block device. The only way to do that with 
 multiple disks is to use a virtual block device like LVM/LVM2 which (as 
 I've stated before) I'm hesitant to do.

Yes, implicit in my comments was that you were using XFS or ext4 with some sort 
of RAID (Linux md RAID or hardware) and Linux's LVM2.   

You can use XFS and ext4 without RAID and LVM, but if you're going to compare 
to ZFS, you can't fairly ignore these features just because it makes ZFS look 
better.

 btrfs didn't have any sort of fsck

Neither does ZFS.

btrfs doesn't need an fsck for pretty much the same reason ZFS doesn't.  Both 
filesystems effectively keep themselves fsck'd all the time, and you can do an 
online scrub if you're ever feeling paranoid.

ZFS is nicer in this regard, in that it lets you schedule the scrub operation.  
You can obviously schedule one for btrfs, but that doesn't take into account 
scrub time.  If you tell ZFS to scrub every day, there will be 24 hours of gap 
between scrubs.

We use 1 week at the office, and each scrub takes about a day, so the scrub 
date rotates around the calendar by about a day per week.

ZFS also has better checksumming than btrfs: up to 256 bits, vs 32 in btrfs.  
(1-in-4-billion odds of undetected corruption per block is still pretty good, 
though.)

 There was one released a while 
 back that had some severe limitations. This has made me wary.

All of the ZFSes out there are crippled relative to what's shipping in Solaris 
now, because Oracle has stopped releasing code.  There are nontrivial features 
in zpool v29+, which simply aren't in the free forks of older versions of the 
Sun code.

Some of the still-active forks are of even older versions.  I'm aware of one 
popular ZFS implementation still based on zpool *v8*.

If all you're doing is looking at feature sets, you can find reasons to reject 
every single option.

 There are dkms RPMs on the website. 
 http://zfsonlinux.org/epel.html

It is *possible* that keeping the CDDL ZFS code in a separate module manages to 
avoid tainting the GPL kernel code, in the same way that some people talk 
themselves into allowing proprietary GPU drivers with DRM support into their 
kernels.

You're playing with fire here.  Bring good gloves.



[*] or other hybrid RAID system; I don't mean to suggest that only Drobo can do 
this
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread John R Pierce
On 10/24/2013 11:18 PM, Warren Young wrote:
 All of the ZFSes out there are crippled relative to what's shipping in 
 Solaris now, because Oracle has stopped releasing code.  There are nontrivial 
 features in zpool v29+, which simply aren't in the free forks of older 
 versions of the Sun code.

OpenZFS is doing pretty well on the BSD/etc. side of things.  Some of 
the original developers of ZFS, who long since bailed on Oracle, are 
contributing code that's not in the Oracle branch; they forked in 2010, 
with the last release from Sun, when OpenSolaris was discontinued.  The 
current version of OpenZFS no longer relies on 'version numbers'; 
instead it has 'feature flags' for all post-v28 features.  The version 
in my FreeBSD 9.1-stable system has feature flags for...

async_destroy (read-only compatible)
  Destroy filesystems asynchronously.
empty_bpobj   (read-only compatible)
  Snapshots use less space.
lz4_compress
  LZ4 compression algorithm support.
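
If you want to see what a given build supports, something along these
lines should work (the pool name is just an example):

    zpool upgrade -v                      # feature flags / versions this build knows about
    zpool get all tank | grep feature@    # which features are enabled or active on a pool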



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Lists
On 10/24/2013 11:18 PM, Warren Young wrote:
 - vdev, which is a virtual device, something like a software RAID.  It is one 
 or more disks, configured together, typically with some form of redundancy.

 - pool, which is one or more vdevs, which has a capacity equal to all of its 
 vdevs added together.

Thanks for the clarification of terms.

 You would have 3 TB *if* you configured these disks as two separate vdevs.

 If you tossed all four disks into a single vdev, you could have only 2 TB 
 because the smallest disk in a vdev limits the total capacity.

 (This is yet another way ZFS isn't like a Drobo[*], despite the fact that a 
 lot of people hype it as if it were the same thing.)

Two separate vdevs is pretty much what I was after. Drobo: another 
interesting option :)

 Are you suggesting we add a couple of
 4 TB drives:

 A1 - A2 = 2x 1TB drives, 1 TB redundant storage.
 B1 - B2 = 2x 2TB drives, 2 TB redundant storage.
 C1 - C2 = 2x 4TB drives, 4 TB redundant storage.

 Then wait until ZFS moves A1/A2 over to C1/C2 before removing A1/A2? If
 so, that's capability I'm looking for.
 No.  ZFS doesn't let you remove a vdev from a pool once it's been added, 
 without destroying the pool.

 The supported method is to add disks C1 and C2 to the *A* vdev, then tell ZFS 
 that C1 replaces A1, and C2 replaces A2.  The filesystem will then proceed to 
 migrate the blocks in that vdev from the A disks to the C disks. (I don't 
 remember if ZFS can actually do both in parallel.)

 Hours later, when that replacement operation completes, you can kick disks A1 
 and A2 out of the vdev, then physically remove them from the machine at your 
 leisure.  Finally, you tell ZFS to expand the vdev.

 (There's an auto-expand flag you can set, so that last step can happen 
 automatically.)

 If you're not seeing the distinction, it is that there never were 3 vdevs at 
 any point during this upgrade.  The two C disks are in the A vdev, which 
 never went away.

I see the distinction about vdevs vs. block devices. Still, the process 
you outline is *exactly* the capability that I'm looking for, despite 
the distinction in semantics.

 Yes, implicit in my comments was that you were using XFS or ext4 with some 
 sort of RAID (Linux md RAID or hardware) and Linux's LVM2.

 You can use XFS and ext4 without RAID and LVM, but if you're going to compare 
 to ZFS, you can't fairly ignore these features just because it makes ZFS look 
 better.

I've had good results with Linux's software RAID+Ext[2-4].  For example, 
I *love* that you can mount a RAID1 member drive directly in a 
worst-case scenario. LVM2 complicates administration terribly. The 
widely touted, simplified administration of ZFS is quite attractive to me.

I'm just trying to find the best tool for the job. That may well end up 
being Drobo!

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread John R Pierce
On 10/25/2013 10:33 AM, Lists wrote:
 LVM2 complicates administration terribly.

huh?  it hugely simplifies it for me, when I have lots of drives. I just 
wish mdraid and lvm were better integrated.  To see how it should have 
been done, see IBM AIX's version of lvm: you grow a jfs file system and 
it automatically grows the underlying LV (logical volume), online, 
live.  Mirroring in AIX is done via lvm.







-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Chuck Munro

On 10/25/2013, 05:00 , centos-requ...@centos.org wrote:
 We are a CentOS shop, and have the lucky, fortunate problem of having
 ever-increasing amounts of data to manage. EXT3/4 becomes tough to
 manage when you start climbing, especially when you have to upgrade, so
 we're contemplating switching to ZFS.

 As of last spring, it appears that ZFS On Linux http://zfsonlinux.org/
 calls itself production ready despite a version number of 0.6.2, and
 being acknowledged as unstable on 32 bit systems.

 However, given the need to do backups, zfs send sounds like a godsend
 over rsync which is running into scaling problems of its own. (EG:
 Nightly backups are being threatened by the possibility of taking over
 24 hours per backup)

 Was wondering if anybody here could weigh in with real-life experience?
 Performance/scalability?

 -Ben

FWIW, I manage a small IT shop with a redundant pair of ZFS file servers 
running the  zfsonlinux.org  package on 64-bit ScientificLinux-6 
platforms.  CentOS-6 would work just as well.  Installing it with yum 
couldn't be simpler, but configuring it takes a bit of reading and 
experimentation.  I reserved a bit more than 1GByte of RAM for each 
TByte of disk.

One machine (20 useable TBytes in raid-z3) is the SMB server for all of 
the clients, and the other machine (identically configured) sits in the 
background acting as a hot spare.  Users tell me that performance is 
quite good.

After about 2 months of testing, there have been no problems whatsoever, 
although I'll admit the servers do not operate under much stress.  There 
is a cron job on each machine that does a scrub every Sunday.
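
For reference, that weekly scrub is just a one-line cron entry; a
sketch (the pool name and the path to zpool are examples, adjust for
your install):

    # /etc/cron.d/zfs-scrub -- scrub the pool early Sunday morning
    30 2 * * 0 root /sbin/zpool scrub tank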

The old ext4 primary file servers have been shut down and the ZFS boxes 
put into production, although one of the old ext4 servers will remain 
rsync'd to the new machines for a few more months (just in case).

The new servers have the zfsonlinux repositories configured for manual 
updates, but the two machines tend to be left alone unless there are 
important security updates or new features I need.

To keep the two servers in sync I use 'lsyncd' which is essentially a 
front-end for rsync that cuts down thrashing and overhead dramatically 
by excluding the full filesystem scan and using inotify to figure out 
what to sync.  This allows almost-real-time syncing of the backup 
machine.  (BTW, you need to crank the resources for inotify way up 
for large filesystems with a couple million files.)
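
The inotify limits in question are ordinary sysctls; the sort of bump
meant here looks roughly like this (the numbers are only an
illustration, size them to your file count):

    # /etc/sysctl.conf
    fs.inotify.max_user_watches = 1048576
    fs.inotify.max_queued_events = 1048576
    fs.inotify.max_user_instances = 1024
    # apply without a reboot
    sysctl -p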

So far, so good.  I still have a *lot* to learn about ZFS and its 
feature set, but for now it's doing the job very nicely.  I don't miss 
the long ext4 periodic fsck's one bit  :-)

YMMV, of course,
Chuck
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Mauricio Tavares
On Fri, Oct 25, 2013 at 1:40 PM, John R Pierce pie...@hogranch.com wrote:
 On 10/25/2013 10:33 AM, Lists wrote:
 LVM2 complicates administration terribly.

 huh?  it hugely simplifies it for me, when I have lots of drives. I just
 wish mdraid and lvm were better integrated.  to see how it should have
  been done, see IBM AIX's version of lvm: you grow a jfs file system,
 it automatically grows the underlying LV (logical volume), online,
 live.   mirroring in AIX is done via lvm.

  Funny you mentioned AIX's JFS. I really like what they did there.

 --
 john r pierce  37N 122W
 somewhere on the middle of the left coast

 ___
 CentOS mailing list
 CentOS@centos.org
 http://lists.centos.org/mailman/listinfo/centos
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Warren Young
On 10/25/2013 00:44, John R Pierce wrote:
 current version of OpenZFS no longer relies on 'version numbers',
 instead it has 'feature flags' for all post v28 features.

This must be the zpool v5000 thing I saw while researching my previous 
post.  Apparently ZFSonLinux is doing the same thing, or is perhaps also 
based on OpenZFS.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Warren Young
On 10/25/2013 11:33, Lists wrote:

 I'm just trying to find the best tool for the job.

Try everything.  Seriously.

You won't know what you like, and what works *for you* until you have 
some experience.  Buy a Drobo for the home, replace one of your old file 
servers with a FreeBSD ZFS box, enable LVM on the next Linux workstation.

 That may well end up being Drobo!

Drobos are no panacea, either.

Years ago, my Drobo FS would disappear from the network occasionally, 
and have to be rebooted.  (This seems to be fixed now.)

My boss's first-generation Drobo killed itself in a power outage.  It 
was directly attached to his Windows box, and on restart, chkdsk 
couldn't find a filesystem at all.  A data recovery program was able to 
pull files back off the disk, though, so it's not like the unit was 
actually dead.  It just managed to corrupt the NTFS data structures 
thoroughly, despite the fact that it's supposed to be a redundant 
filesystem.  It implies Drobo isn't using a battery-backed RAM cache, 
for their low-end units at least.

Every Drobo I've ever used[*] has been much slower than a 
comparably-priced dumb RAID.

The first Drobos would benchmark at about 20 MByte/sec when populated by 
disks capable of 100 MByte/sec raw.  The two subsequent Drobo 
generations were touted as faster, but I don't think I ever hit even 30 
MByte/sec.

Data migration after replacing a disk is also uncomfortably slow.  The 
fastest I've ever seen a disk replacement take is about a day.  As disks 
have gotten bigger, my existing Drobos haven't gotten faster, so now 
migration might take a week!  It's for this single reason that I now 
refuse to use single-disk redundancy with Drobos.  The window without 
protection is just too big now.

A lot of this is doubtless down to the small embedded processor in these 
things.  ZFS on a real computer is simply in a different class.


[*] I haven't yet used a Thunderbolt or B series professional version. 
  It is possible they're running at native disk speeds.  But then, 
they're even more expensive.

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread John R Pierce
On 10/25/2013 1:26 PM, Warren Young wrote:
 On 10/25/2013 00:44, John R Pierce wrote:
 current version of OpenZFS no longer relies on 'version numbers',
 instead it has 'feature flags' for all post v28 features.
 This must be the zpool v5000 thing I saw while researching my previous
 post.  Apparently ZFSonLinux is doing the same thing, or is perhaps also
 based on OpenZFS.


indeed, it is OpenZFS



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Warren Young
On re-reading, I realized I didn't complete some of my thoughts:

On 10/25/2013 00:18, Warren Young wrote:
 ZFS is nicer in this regard, in that it lets you schedule the scrub
 operation.  You can obviously schedule one for btrfs,

...with cron...

 but that doesn't take into account scrub time.

This is important because a ZFS scrub takes absolute lowest priority. 
(Presumably true for btrfs, too.)  Any time the filesystem has to 
service an I/O request, the scrub stops, then resumes when the I/O 
request has completed, unless another has arrived in the meantime.

This means that you cannot know how long a scrub will take unless you 
can exactly predict your future disk I/O.  Scheduling a scrub with cron 
could land you in a situation where the previous scrub is still running 
due to unusually high I/O when another scrub request comes in.

I initially set our ZFS file server up so that it would start scrubbing 
at close of business on Friday, but due to the way ZFS scrub scheduling 
works, the most recent scrub started late Wednesday and ran into 
Thursday.  This isn't a problem.  The scrub doesn't run in parallel to 
normal I/O, I don't even notice that the array is scrubbing itself 
unless I go over and watchen das blinkenlights astaunished.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-25 Thread Peter
On 10/26/2013 06:40 AM, John R Pierce wrote:
 
 to see how it should have 
  been done, see IBM AIX's version of lvm: you grow a jfs file system, 
 it automatically grows the underlying LV (logical volume), online, 
 live.

lvm can do this with the --resizefs flag to lvextend: one command to
grow both the logical volume and the fs, and it can be done live,
provided the fs supports it.
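
For example (volume group and LV names are invented):

    # grow the LV by 500G and resize the filesystem on it in one step
    lvextend --resizefs -L +500G /dev/vg_data/lv_data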


Peter
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread John R Pierce
On 10/24/2013 1:41 PM, Lists wrote:
 Was wondering if anybody here could weigh in with real-life experience?
 Performance/scalability?

I've only used ZFS on Solaris and FreeBSD. Some general observations...

1) you need a LOT of ram for decent performance on large zpools. 1GB ram 
above your basic system/application requirements per terabyte of zpool 
is not unreasonable.

2) don't go overboard with snapshots.   a few 100 are probably OK, but 
1000s (*) will really drag down the performance of operations that 
enumerate file systems.

3) NEVER let a zpool fill up above about 70% full, or the performance 
really goes downhill.

4) I prefer using striped mirrors (aka raid10) over raidz/z2, but my 
applications are primarily database.

(*) ran into a guy who had 100s of zfs 'file systems' (mount points), 
per user home directories, and was doing nightly snapshots going back 
several years, and his zfs commands were taking a long long time to do 
anything, and he couldn't figure out why.  I think he had over 10,000 
filesystems * snapshots.


-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread SilverTip257
On Thu, Oct 24, 2013 at 4:41 PM, Lists li...@benjamindsmith.com wrote:

 We are a CentOS shop, and have the lucky, fortunate problem of having
 ever-increasing amounts of data to manage. EXT3/4 becomes tough to
 manage when you start climbing, especially when you have to upgrade, so
 we're contemplating switching to ZFS.


You didn't mention XFS.
Just curious if you considered it or not.



 As of last spring, it appears that ZFS On Linux http://zfsonlinux.org/
 calls itself production ready despite a version number of 0.6.2, and
 being acknowledged as unstable on 32 bit systems.

 However, given the need to do backups, zfs send sounds like a godsend
 over rsync which is running into scaling problems of its own. (EG:
 Nightly backups are being threatened by the possibility of taking over
 24 hours per backup)

 Was wondering if anybody here could weigh in with real-life experience?
 Performance/scalability?

 -Ben

 PS: I joined their mailing list recently, will be watching there as
 well. We will, of course, be testing for a while before making the
 switch.

 ___
 CentOS mailing list
 CentOS@centos.org
 http://lists.centos.org/mailman/listinfo/centos




-- 
---~~.~~---
Mike
//  SilverTip257  //
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Lists
On 10/24/2013 01:59 PM, John R Pierce wrote:


 1) you need a LOT of ram for decent performance on large zpools. 1GB ram
 above your basic system/application requirements per terabyte of zpool
 is not unreasonable.

That seems quite reasonable to me. Our existing equipment has far more 
than enough RAM to make this a comfortable experience.

 2) don't go overboard with snapshots.   a few 100 are probably OK, but
 1000s (*) will really drag down the performance of operations that
 enumerate file systems.

Our intended use for snapshots is to enable consistent backup points, 
something we're simulating now with rsync and its hard-link option. We 
haven't figured out the best way to do this, but in our backup clusters 
we have rarely more than 100 save points at any one time.
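
For context, the rsync hard-link scheme being simulated is presumably
something like the following, using --link-dest (paths invented):

    # unchanged files become hard links into yesterday's tree; only changes use new space
    rsync -a --delete --link-dest=/backup/2013-10-23 /data/ /backup/2013-10-24/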

 3) NEVER let a zpool fill up above about 70% full, or the performance
 really goes downhill.

Thanks for the tip!

 (*) ran into a guy who had 100s of zfs 'file systems' (mount points),
 per user home directories, and was doing nightly snapshots going back
 several years, and his zfs commands were taking a long long time to do
 anything, and he couldn't figure out why.  I think he had over 10,000
 filesystems * snapshots.

Wow. Couldn't he have the same results by putting all the home 
directories on a single ZFS partition?
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Keith Keller
On 2013-10-24, SilverTip257 silvertip...@gmail.com wrote:
 On Thu, Oct 24, 2013 at 4:41 PM, Lists li...@benjamindsmith.com wrote:

 We are a CentOS shop, and have the lucky, fortunate problem of having
 ever-increasing amounts of data to manage. EXT3/4 becomes tough to
 manage when you start climbing, especially when you have to upgrade, so
 we're contemplating switching to ZFS.

 You didn't mention XFS.
 Just curious if you considered it or not.

XFS is better than ext3/4 for many applications, but it's still not as
powerful as ZFS, which basically combines RAID, filesystem, and LVM into
one.  It sounds like the OP is really looking to take advantage of the
extra features of ZFS.

 Was wondering if anybody here could weigh in with real-life experience?

I don't have my own, but I have heard of other shops which have had lots
of success with ZFS on OpenSolaris and their variants.  I know of some
places which are starting to put ZFS on linux into testing or
preproduction, but nothing really extensive yet.

--keith

-- 
kkel...@wombat.san-francisco.ca.us


___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread John R Pierce
On 10/24/2013 2:59 PM, Lists wrote:
 (*) ran into a guy who had 100s of zfs 'file systems' (mount points),
 per user home directories, and was doing nightly snapshots going back
 several years, and his zfs commands were taking a long long time to do
 anything, and he couldn't figure out why.  I think he had over 10,000
 filesystems * snapshots.
 Wow. Couldn't he have the same results by putting all the home
 directories on a single ZFS partition?

I believe he wanted quotas per user.   ZFS quotas were only implemented 
at the file system level, at least as of whatever version he was running 
(I don't know if that's changed, as I never mess with quotas).



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Lists
On 10/24/2013 02:47 PM, SilverTip257 wrote:
 You didn't mention XFS.
 Just curious if you considered it or not.

Most definitely. There are a few features that I'm looking for:

1) MOST IMPORTANT: STABLE!

2) The ability to make the partition  bigger by adding drives with very 
minimal/no downtime.

3) The ability to remove an older, (smaller) drive or drives in order to 
replace with larger capacity drives without downtime or having to copy 
over all the files manually.

4) The ability to create snapshots with no downtime.

5) The ability to synchronize snapshots quickly and without having to 
scan every single file. (backups)

6) Reasonable failure mode. Things *do* go south sometimes. Simple is 
better, especially when it's simpler for the (typically highly stressed) 
administrator.

7) Big. Basically all filesystems in question can handle our size 
requirements. We might hit a 100 TB  partition in the next 5 years.

I think ZFS and BTRFS are the only candidates that claim to do all the 
above. Btrfs seems to have been "stable in a year or so" for as long as 
I could keep a straight face around the word "Gigabyte", so it's a 
non-starter at this point.

LVM2/Ext4 can do much of the above. However, horror stories abound, 
particularly around very large volumes. Also, LVM2 can be terrible in 
failure situations.

XFS does snapshots, but don't you have to freeze the volume first? 
Xfsrestore looks interesting for backups, though I don't know if there's 
a consistent freeze point. (what about ongoing writes?) Not sure about 
removing HDDs in a volume with XFS.

Not as sure about ZFS' stability on Linux (those who run direct Unix 
derivatives seem to rave about it) and failure modes.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Rainer Duffner

Am 25.10.2013 um 00:47 schrieb John R Pierce pie...@hogranch.com:

 On 10/24/2013 2:59 PM, Lists wrote:
 (*) ran into a guy who had 100s of zfs 'file systems' (mount points),
 per user home directories, and was doing nightly snapshots going back
 several years, and his zfs commands were taking a long long time to do
 anything, and he couldn't figure out why.  I think he had over 10,000
 filesystems * snapshots.
 Wow. Couldn't he have the same results by putting all the home
 directories on a single ZFS partition?
 
 I believe he wanted quotas per user.   ZFS quotas were only implemented 
 at the file system level, at least as of whatever version he was running 
 (I don't know if thats changed, as I never mess with quotas).
 
 


User and group quotas have been possible for some time.
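
For example, per-user and per-group quotas are ordinary dataset
properties these days (the dataset, user, and group names are
invented):

    zfs set userquota@alice=20G tank/home
    zfs set groupquota@staff=500G tank/home
    zfs get userquota@alice tank/home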

ZFS is cool. But there are a lot of issues, and stuff that needs to be tuned 
but where it is difficult to find out whether it needs tuning at all.


Especially, if you run into performance-problems.

Once you have some experience with it, I recommend reading this blog:
http://nex7.blogspot.ch

and of course, the FreeNAS forum, where you can read about stuff like that:

https://bugs.freenas.org/issues/1531

On the surface, ZFS is great. But god help you if you run into problems.


___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread George Kontostanos
We tested ZFS on CentOS 6.4 a few months ago using a decent Supermicro
server with 16GB RAM and 11 drives in RAIDZ3. Same specs as a middle-range
storage server that we build mainly using FreeBSD.

Performance was not bad, but eventually we ran into a situation where we
could not import a pool anymore after a kernel / modules update.

I would not recommend it for production...


On Fri, Oct 25, 2013 at 2:12 AM, Lists li...@benjamindsmith.com wrote:

 On 10/24/2013 02:47 PM, SilverTip257 wrote:
  You didn't mention XFS.
  Just curious if you considered it or not.

 Most definitely. There are a few features that I'm looking for:

 1) MOST IMPORTANT: STABLE!

 2) The ability to make the partition  bigger by adding drives with very
 minimal/no downtime.

 3) The ability to remove an older, (smaller) drive or drives in order to
 replace with larger capacity drives without downtime or having to copy
 over all the files manually.

 4) The ability to create snapshots with no downtime.

 5) The ability to synchronize snapshots quickly and without having to
 scan every single file. (backups)

 6) Reasonable failure mode. Things *do* go south sometimes. Simple is
 better, especially when it's simpler for the (typically highly stressed)
 administrator.

 7) Big. Basically all filesystems in question can handle our size
 requirements. We might hit a 100 TB  partition in the next 5 years.

 I think ZFS and BTRFS are the only candidates that claim to do all the
 above. Btrfs seems to have been stable in a year or so for as long as
 I could keep a straight face around the word Gigabyte, so it's a
 non-starter at this point.

 LVM2/Ext4 can do much of the above. However, horror stories abound,
 particularly around very large volumes. Also, LVM2 can be terrible in
 failure situations.

 XFS does snapshots, but don't you have to freeze the volume first?
 Xfsrestore looks interesting for backups, though I don't know if there's
 a consistent freeze point. (what about ongoing writes?) Not sure about
 removing HDDs in a volume with XFS.

 Not as sure about ZFS' stability on Linux (those who run direct Unix
 derivatives seem to rave about it) and failure modes.
 ___
 CentOS mailing list
 CentOS@centos.org
 http://lists.centos.org/mailman/listinfo/centos




-- 
George Kontostanos
---
http://www.aisecure.net
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread John R Pierce
On 10/24/2013 4:12 PM, Lists wrote:
 On 10/24/2013 02:47 PM, SilverTip257 wrote:
 You didn't mention XFS.
 Just curious if you considered it or not.
 Most definitely. There are a few features that I'm looking for:

 1) MOST IMPORTANT: STABLE!

XFS is quite stable in CentOS 6.4 64-bit.
There was a flaky kernel issue circa 6.2.

 2) The ability to make the partition  bigger by adding drives with very
 minimal/no downtime.

XFS+LVM+mdraid does this, but it requires several manual steps...

I'd take the new drives, add them to a new md mirror, then add that md 
device to the volume group, then lvextend the logical volume, and 
finally xfs_growfs the file system.  Yes, that's a bunch more steps than 
the zpool/zfs commands, but in fact zfs is doing much the same thing 
internally.
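
Spelled out as commands, those steps look roughly like this (device,
VG, LV, and mount point names are all made up):

    # build a new mirror from the two new drives
    mdadm --create /dev/md1 --level=1 --raid-devices=2 /dev/sdc /dev/sdd
    # add it to the volume group and grow the logical volume
    pvcreate /dev/md1
    vgextend vg_data /dev/md1
    lvextend -L +2T /dev/vg_data/lv_data
    # XFS grows online, addressed by mount point
    xfs_growfs /srv/data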

I believe lvm also lets you replace pv's in the vg with new larger 
ones.   I haven't had to do this yet.


-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Warren Young
On 10/24/2013 17:12, Lists wrote:

 2) The ability to make the partition  bigger by adding drives with very
 minimal/no downtime.

Be careful: you may have been reading some ZFS hype that turns out not 
to be as rosy in reality.

Ideally, ZFS would work like a Drobo with an infinite number of drive 
bays.  Need to add 1 TB of disk space or so?  Just whack another 1 TB 
disk into the pool, no problem, right?

Doesn't work like that.

You can add another disk to an existing pool, but it doesn't instantly 
make the pool bigger.  You can make it a hot spare, but you can't tell 
ZFS to expand the pool over the new drive.

But, you say, didn't I read that...?  Yes, you did.  ZFS *can* do 
what you want, just not in the way you were probably expecting.

The least complicated *safe* way to add 1 TB to a pool is add *two* 1 TB 
disks to the system, create a ZFS mirror out of them, and add *that* 
vdev to the pool.  That gets you 1 TB of redundant space, which is what 
you actually wanted.  Just realize, you now have two separate vdevs 
here, both providing storage space to a single pool.
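
That is, something like (pool and device names invented):

    # add a new two-disk mirror vdev to the existing pool
    zpool add tank mirror /dev/sdc /dev/sdd
    zpool status tank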

You could instead turn that new single disk into a non-redundant 
separate vdev and add that to the pool, but then that one disk can take 
down the entire pool if it dies.

Another problem is that you have now created a system where ZFS has to 
guess which vdev to put a given block of data on.  Your 2-disk mirror of 
newer disks probably runs faster than the old 3+ disk raidz vdev, but 
ZFS isn't going to figure that out on its own.  There are ways to 
encourage ZFS to use one vdev over another.  There's even a special 
case mode where you can tell it about an SSD you've added to act purely 
as an intermediary cache, between the spinning disks and the RAM caches.

The more expensive way to go -- which is simpler in the end -- is to 
replace each individual disk in the existing pool with a larger one, 
letting ZFS resilver each new disk, one at a time.  Once all disks have 
been replaced, *then* you can grow that whole vdev, and thus the pool.

But, XFS and ext4 can do that, too.  ZFS only wins when you want to add 
space by adding vdevs.

 3) The ability to remove an older, (smaller) drive or drives in order to
 replace with larger capacity drives without downtime or having to copy
 over all the files manually.

Some RAID controllers will let you do this.  XFS and ext4 have specific 
support for growing an existing filesystem to fill a larger volume.

 6) Reasonable failure mode. Things *do* go south sometimes. Simple is
 better, especially when it's simpler for the (typically highly stressed)
 administrator.

I find it simpler to use ZFS to replace a failed disk than any RAID BIOS 
or RAID management tool I've ever used.  ZFS's command line utilities 
are quite simply slick.  It's an under-hyped feature of the filesystem, 
if anything.

A lot of thought clearly went into the command language, so that once 
you learn a few basics, you can usually guess the right command in any 
given situation.  That sort of good design doesn't happen by itself.

All other disk management tools I've used seem to have just accreted 
features until they're a pile of crazy.  The creators of ZFS came along 
late enough in the game that they were able to look at everything and 
say, No no no, *this* is how you do it.

 I think ZFS and BTRFS are the only candidates that claim to do all the
 above. Btrfs seems to have been stable in a year or so for as long as
 I could keep a straight face around the word Gigabyte, so it's a
 non-starter at this point.

I don't think btrfs's problem is stability as much as lack of features. 
  It only just got parity redundancy (RAID-5/6) features recently, for 
example.

It's arguably been *stable* since it appeared in release kernels about 
four years ago.

One big thing may push you to btrfs: With ZFS on Linux, you have to 
patch your local kernels, and you can't then sell those machines as-is 
outside the company.  Are you willing to keep those kernels patched 
manually, whenever a new fix comes down from upstream?  Do your servers 
spend their whole life in house?

 Not as sure about ZFS' stability on Linux (those who run direct Unix
 derivatives seem to rave about it) and failure modes.

It wouldn't surprise me if ZFS on Linux is less mature than on Solaris 
and FreeBSD, purely due to the age of the effort.

Here, we've been able to use FreeBSD on the big ZFS storage box, and 
share it out to the Linux and Windows boxes over NFS and Samba.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Warren Young
On 10/24/2013 14:59, John R Pierce wrote:
 On 10/24/2013 1:41 PM, Lists wrote:

 1) you need a LOT of ram for decent performance on large zpools. 1GB ram
 above your basic system/application requirements per terabyte of zpool
 is not unreasonable.

To be fair, you want to treat XFS the same way.

And it, too, is unstable on 32-bit systems with anything but smallish 
filesystems, due to lack of RAM.
___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread John R Pierce
On 10/24/2013 5:31 PM, Warren Young wrote:
 To be fair, you want to treat XFS the same way.

 And it, too is unstable on 32-bit systems with anything but smallish
 filesystems, due to lack of RAM.

I thought it had stack requirements that 32 bit couldn't meet, and it 
would simply crash, so it is not built into 32bit versions of EL6.



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread John R Pierce
On 10/24/2013 5:29 PM, Warren Young wrote:
 The least complicated*safe*  way to add 1 TB to a pool is add*two*  1 TB
 disks to the system, create a ZFS mirror out of them, and add*that*  
 vdev to the pool.  That gets you 1 TB of redundant space, which is what
 you actually wanted.  Just realize, you now have two separate vdevs
 here, both providing storage space to a single pool.

Yeah, I guess I should have made that clearer; that's exactly what you do.


And it doesn't restripe old files until they get rewritten.  New stuff 
will be striped across all the vdevs; old stuff stays where it is.



-- 
john r pierce  37N 122W
somewhere on the middle of the left coast

___
CentOS mailing list
CentOS@centos.org
http://lists.centos.org/mailman/listinfo/centos


Re: [CentOS] ZFS on Linux in production?

2013-10-24 Thread Lists
On 10/24/2013 05:29 PM, Warren Young wrote:
 On 10/24/2013 17:12, Lists wrote:
 2) The ability to make the partition  bigger by adding drives with very
 minimal/no downtime.
 Be careful: you may have been reading some ZFS hype that turns out not
 to be as rosy in reality. Ideally, ZFS would work like a Drobo with an
 infinite number of drive
 bays.  Need to add 1 TB of disk space or so?  Just whack another 1 TB
 disk into the pool, no problem, right?

 Doesn't work like that.

 You can add another disk to an existing pool, but it doesn't instantly
 make the pool bigger.  You can make it a hot spare, but you can't tell
 ZFS to expand the pool over the new drive.

 But, you say, didn't I read that   Yes, you did.  ZFS *can* do
 what you want, just not in the way you were probably expecting.

 The least complicated *safe* way to add 1 TB to a pool is add *two* 1 TB
 disks to the system, create a ZFS mirror out of them, and add *that*
 vdev to the pool.  That gets you 1 TB of redundant space, which is what
 you actually wanted.  Just realize, you now have two separate vdevs
 here, both providing storage space to a single pool.

 You could instead turn that new single disk into a non-redundant
 separate vdev and add that to the pool, but then that one disk can take
 down the entire pool if it dies.

We have redundancy at the server/host level, so even if we have a 
fileserver go completely offline, our application retains availability. 
We have an API in our application stack that negotiates with the 
(typically 2 or 3) file stores.

 Another problem is that you have now created a system where ZFS has to
 guess which vdev to put a given block of data on.  Your 2-disk mirror of
 newer disks probably runs faster than the old 3+ disk raidz vdev, but
 ZFS isn't going to figure that out on its own.  There are ways to
 encourage ZFS to use one vdev over another.  There's even a special
 case mode where you can tell it about an SSD you've added to act purely
 as an intermediary cache, between the spinning disks and the RAM caches.

Performance isn't so much an issue - we'd partition our cluster and 
throw a few more boxes into place if it became a bottleneck.

 The more expensive way to go -- which is simpler in the end -- is to
 replace each individual disk in the existing pool with a larger one,
 letting ZFS resilver each new disk, one at a time.  Once all disks have
 been replaced, *then* you can grow that whole vdev, and thus the pool.
Not sure enough of the vernacular, but let's say you have 4 drives in a 
RAID 1 configuration: 1 set of 1 TB drives and another set of 2 TB drives.

A1 - A2 = 2x 1TB drives, 1 TB redundant storage.
B1 - B2 = 2x 2TB drives, 2 TB redundant storage.

We have 3 TB of available storage. Are you suggesting we add a couple of 
4 TB drives:

A1 - A2 = 2x 1TB drives, 1 TB redundant storage.
B1 - B2 = 2x 2TB drives, 2 TB redundant storage.
C1 - C2 = 2x 4TB drives, 4 TB redundant storage.

Then wait until ZFS moves A1/A2 over to C1/C2 before removing A1/A2? If 
so, that's capability I'm looking for.

 But, XFS and ext4 can do that, too.  ZFS only wins when you want to add
 space by adding vdevs.

The only way I'm aware of ext4 doing this is with resize2fs, which is 
extending a partition on a block device. The only way to do that with 
multiple disks is to use a virtual block device like LVM/LVM2 which (as 
I've stated before) I'm hesitant to do.

 3) The ability to remove an older, (smaller) drive or drives in order to
 replace with larger capacity drives without downtime or having to copy
 over all the files manually.
 Some RAID controllers will let you do this.  XFS and ext4 have specific
 support for growing an existing filesystem to fill a larger volume.

LVM2 will let you remove a drive without taking it offline. Can XFS do 
this without some block device virtualization like LVM2? (I didn't think 
so)

 6) Reasonable failure mode. Things *do* go south sometimes. Simple is
 better, especially when it's simpler for the (typically highly stressed)
 administrator.
 I find it simpler to use ZFS to replace a failed disk than any RAID BIOS
 or RAID management tool I've ever used.  ZFS's command line utilities
 are quite simply slick.  It's an under-hyped feature of the filesystem,
 if anything.

 A lot of thought clearly went into the command language, so that once
 you learn a few basics, you can usually guess the right command in any
 given situation.  That sort of good design doesn't happen by itself.

 All other disk management tools I've used seem to have just accreted
 features until they're a pile of crazy.  The creators of ZFS came along
 late enough in the game that they were able to look at everything and
 say, No no no, *this* is how you do it.

I sooo hear your music here! What really sucks about filesystem 
management is that the time when you really need to get it right is 
when everything seems to be the most complex.

 I think ZFS and BTRFS are the only candidates that claim to do