Re: Got a call at 4am - RAID Gurus Please Read

2014-12-12 Thread Rob Seastrom

Jon Lewis jle...@lewis.org writes:

 OpenSolaris (or even Solaris 11), ZFS, Stable.  Pick one.  Maybe
 two. Three?  Yeah right.  Anyone who's used it hard, under heavy load,
 should understand.

The most recent release of OpenSolaris was over 5 years ago.  You're
working from (extremely) dated information.

The current FOSS Solaris ecosystem forked when Oracle brought stuff
back in-house.  Significant development has happened over the
intervening half-decade.  

Anyone who's using NexentaStor (or hosted in Joyent Cloud) is getting
all three (supra).

-r



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-12 Thread Barry Shein

That might be close enough. I need to set up a test system and play
around with zfs and btrfs.

Thanks.

On December 11, 2014 at 21:29 mysi...@gmail.com (Jimmy Hess) wrote:
  On Thu, Dec 11, 2014 at 9:05 PM, Barry Shein b...@world.std.com wrote:
  [snip]
   From my reading the closest you can get to disk space quotas in ZFS is
   by limiting on a per directory (dataset, mount) basis which is similar
   but different.
  
  This is the normal type of quota within ZFS.   it is applied to a
  dataset and limits the size of the dataset, such as
  home/username.
  You can have as many datasets (filesystems) as you like  (within
  practical limits),  which is probably the way to go in regards to home
  directories.
  
  But another option is
  
  zfs set groupquota@groupname=100GB   example1/blah
  zfs set userquota@user1=200MB   example1/blah
  
  This would be available on the  Solaris implementation.
  
  
  I am not 100% certain that this is available under the BSD implementations,
  even if QUOTA is enabled in your kernel config.
  
  In the past the BSD implementation of ZFS never seemed to be as
  stable, functional, or performant as the OpenSolaris/Illumos version.
  
  --
  -JH

-- 
-Barry Shein

The World  | b...@theworld.com   | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD| Dial-Up: US, PR, Canada
Software Tool  Die| Public Access Internet | SINCE 1989 *oo*


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Rob Seastrom

Gary Buhrmaster gary.buhrmas...@gmail.com writes:

 There is always Illumos/OmniOS/SmartOS
 to consider (depending on your particular
 requirements) which can do ZFS and KVM.

2.5-year SmartOS user here.  Generally speaking it's pretty good, though I
have my list of gripes, as with everything else I touch.

-r



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Bacon Zombie
Are you running ZFS and RAIDZ on Linux or BSD?
On 10 Dec 2014 23:21, Javier J jav...@advancedmachines.us wrote:

 I'm just going to chime in here since I recently had to deal with bit-rot
 affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)

 We couldn't rebuild because of 5 URE sectors on one of the other disks in
 the array after a power / ups issue rebooted our storage box.

 We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
 using ZFS years ago?

 +1 for ZFS and RAIDZ



 On Wed, Dec 10, 2014 at 8:40 AM, Rob Seastrom r...@seastrom.com wrote:

 
  The subject is drifting a bit but I'm going with the flow here:
 
  Seth Mos seth@dds.nl writes:
 
   Raid10 is the only valid raid format these days. With the disks as big
   as they get these days it's possible for silent corruption.
 
  How do you detect it?  A man with two watches is never sure what time it
  is.
 
  Unless you have a filesystem that detects and corrects silent
  corruption, you're still hosed, you just don't know it yet.  RAID10
  between the disks in and of itself doesn't help.
 
   And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
   rebuilds that take a week, literally. Although the rebuild rate on our
   11 disk raid 6 SSD array (2TB) is less than a day.
 
  I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
  It took nowhere near a day let alone a week.  Theoretically takes 8-11
  hours if the vdev is completely full, proportionately less if it's
  not, and I was at about 2/3 in use.
 
  -r
 
 



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Ryan Brooks
ZFS on BSD or a Solaris-like OS


 On Dec 11, 2014, at 10:06 AM, Bacon Zombie baconzom...@gmail.com wrote:
 
 Are you running ZFS and RAIDZ on Linux or BSD?
 On 10 Dec 2014 23:21, Javier J jav...@advancedmachines.us wrote:
 
 I'm just going to chime in here since I recently had to deal with bit-rot
 affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)
 
 We couldn't rebuild because of 5 URE sectors on one of the other disks in
 the array after a power / ups issue rebooted our storage box.
 
 We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
 using ZFS years ago?
 
 +1 for ZFS and RAIDZ
 
 
 
 On Wed, Dec 10, 2014 at 8:40 AM, Rob Seastrom r...@seastrom.com wrote:
 
 
 The subject is drifting a bit but I'm going with the flow here:
 
 Seth Mos seth@dds.nl writes:
 
 Raid10 is the only valid raid format these days. With the disks as big
 as they get these days it's possible for silent corruption.
 
 How do you detect it?  A man with two watches is never sure what time it
 is.
 
 Unless you have a filesystem that detects and corrects silent
 corruption, you're still hosed, you just don't know it yet.  RAID10
 between the disks in and of itself doesn't help.
 
 And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
 rebuilds that take a week, literally. Although the rebuild rate on our
  11 disk raid 6 SSD array (2TB) is less than a day.
 
 I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
 It took nowhere near a day let alone a week.  Theoretically takes 8-11
 hours if the vdev is completely full, proportionately less if it's
 not, and I was at about 2/3 in use.
 
 -r
 


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Rob Seastrom

+1 on both.  Mostly SmartOS, some FreeNAS (which is FreeBSD underneath).

-r

Ryan Brooks r...@hack.net writes:

 ZFS on BSD or a Solaris-like OS


 On Dec 11, 2014, at 10:06 AM, Bacon Zombie baconzom...@gmail.com wrote:
 
 Are you running ZFS and RAIDZ on Linux or BSD?
 On 10 Dec 2014 23:21, Javier J jav...@advancedmachines.us wrote:
 
 I'm just going to chime in here since I recently had to deal with bit-rot
 affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)
 
 We couldn't rebuild because of 5 URE sectors on one of the other disks in
 the array after a power / ups issue rebooted our storage box.
 
 We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
 using ZFS years ago?
 
 +1 for ZFS and RAIDZ
 
 
 
 On Wed, Dec 10, 2014 at 8:40 AM, Rob Seastrom r...@seastrom.com wrote:
 
 
 The subject is drifting a bit but I'm going with the flow here:
 
 Seth Mos seth@dds.nl writes:
 
 Raid10 is the only valid raid format these days. With the disks as big
 as they get these days it's possible for silent corruption.
 
 How do you detect it?  A man with two watches is never sure what time it
 is.
 
 Unless you have a filesystem that detects and corrects silent
 corruption, you're still hosed, you just don't know it yet.  RAID10
 between the disks in and of itself doesn't help.
 
 And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
 rebuilds that take a week, literally. Although the rebuild rate on our
  11 disk raid 6 SSD array (2TB) is less than a day.
 
 I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
 It took nowhere near a day let alone a week.  Theoretically takes 8-11
 hours if the vdev is completely full, proportionately less if it's
 not, and I was at about 2/3 in use.
 
 -r
 


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Barry Shein

From: Randy Bush ra...@psg.com
 We are now using ZFS RAIDZ and the question I ask myself is, why
 wasn't I using ZFS years ago?

because it is not production on linux, which i have to use because
freebsd does not have kvm/ganeti.  want zfs very very badly.  snif.

I keep reading zfs vs btrfs articles and...inconclusive.

My problem with both is I need quotas, both file and inode, and both
are weaker than ext4 on that, zfs is very weak on this, you can only
sort of simulate them.

-- 
-Barry Shein

The World  | b...@theworld.com   | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD| Dial-Up: US, PR, Canada
Software Tool  Die| Public Access Internet | SINCE 1989 *oo*


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Rob Seastrom

Barry Shein b...@world.std.com writes:

 From: Randy Bush ra...@psg.com
 We are now using ZFS RAIDZ and the question I ask myself is, why
 wasn't I using ZFS years ago?

because it is not production on linux, which i have to use because
freebsd does not have kvm/ganeti.  want zfs very very badly.  snif.

 I keep reading zfs vs btrfs articles and...inconclusive.

 My problem with both is I need quotas, both file and inode, and both
 are weaker than ext4 on that, zfs is very weak on this, you can only
 sort of simulate them.

By file, you mean disk space used?  By whom and where?  Quotas and
reservations on a per-dataset basis are pretty darned well supported
in ZFS.  As for inodes, well, since there isn't really such a thing as
an inode in ZFS...  what exactly are you trying to do here?

-r



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Jimmy Hess
As for conversion between RAID levels: dump and restore is usually your
best bet.  Even if your controller HBA supports a RAID level migration,
for a small array hosted in a server, dump and restore is the least
risky path to successful execution.  You really need to dump anyway:
even on a controller that supports clever RAID level migrations (the
ServeRAID does not fall into this category), there is the possibility
that the operation fails, leading to data loss, so back up first.
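
For the common case where the box is running Linux on top of the array, a
minimal sketch of the back-up-first approach (the paths, mount points,
and external target are all hypothetical) is nothing fancier than:

# copy everything to external storage before touching the RAID config
rsync -aHAXx --numeric-ids / /mnt/external/host-backup/
# ...destroy the old array, build the new RAID10, re-create filesystems...
rsync -aHAXx --numeric-ids /mnt/external/host-backup/ /mnt/newroot/

Verify the backup actually restores somewhere else before you blow away
the old array.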

On Wed, Dec 10, 2014 at 2:49 AM, Seth Mos seth@dds.nl wrote:
 symack schreef op 9-12-2014 22:03:
[snip]
 Raid10 is the only valid raid format these days. With the disks as big
 as they get these days it's possible for silent corruption.

No!  Mistake.   It depends.

RAID6, RAID60, RAID-DP, RAIDZ3, and a few others are perfectly valid
RAID formats, given sufficient sparing.  You get fewer average random
write IOPS per spindle, but better survivability, particularly in the
event of simultaneous double, triple, or even quadruple failures (with
appropriate RAID group sizing), which are not necessarily as rare as
one might intuitively expect.

And silent corruption can be addressed partially via surface scanning
and the built-in ECC on the hard drives.  In addition, for non-SATA
SAS/FC drives, the decent array subsystems low-level format the disks
with a larger sector size at initialization time and slip additional
error-correction data into each chunk's metadata, so silent corruption
or bit-flipping isn't necessarily so silent on a decent piece of
storage equipment.

If you need a configuration with fewer than 12 disk drives, where you
require good performance for many small random reads and writes, and
only cheap controllers are an option, then yeah, you probably need
RAID10, but not always.

In case you have a storage chassis with 16 disk drives, an integrated
RAID controller, a solid 1 to 2 GB NVRAM cache, and a few gigabytes of
read cache, then RAID6 or RAID60, or (maybe) even RAID50, could be a
solid option for a wide range of use cases.

You really just need to calculate an upper bound on the right number
of spindles, spread over the right number of host ports, for the
workload, adjusted for the RAID level you pick and for sufficient
cache (taking into account the caching policy, and including a
sufficiently large safety factor to cover the inherent uncertainties
in spindle performance and the variability of your specific overall
workload).
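
As a back-of-the-envelope example (assuming roughly 150 random IOPS per
10k RPM spindle and ignoring cache entirely, both of which are
assumptions):

  usable random write IOPS ~= (spindles x per-spindle IOPS) / write penalty
  16 spindles, RAID10 (penalty 2): 16 x 150 / 2 = 1200 IOPS
  16 spindles, RAID6  (penalty 6): 16 x 150 / 6 =  400 IOPS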


--
-JH


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Barry Shein

Disk space by uid (by group is a plus but not critical), like BSD and
EXTn.  And the reason I put "inode" in quotes was to indicate that they
may not (certainly will not) be called inodes, but I mean an upper limit
on the total number of files and directories, typically to stop a
runaway script or certain malicious or grossly irresponsible behavior.

From my reading the closest you can get to disk space quotas in ZFS is
by limiting on a per directory (dataset, mount) basis which is similar
but different.

On December 11, 2014 at 16:57 r...@seastrom.com (Rob Seastrom) wrote:
  
  Barry Shein b...@world.std.com writes:
  
   From: Randy Bush ra...@psg.com
   We are now using ZFS RAIDZ and the question I ask myself is, why
   wasn't I using ZFS years ago?
  
  because it is not production on linux, which i have to use because
  freebsd does not have kvm/ganeti.  want zfs very very badly.  snif.
  
   I keep reading zfs vs btrfs articles and...inconclusive.
  
   My problem with both is I need quotas, both file and inode, and both
   are weaker than ext4 on that, zfs is very weak on this, you can only
   sort of simulate them.
  
  By file, you mean disk space used?  By whom and where?  Quotas and
  reservations on a per-dataset basis are pretty darned well supported
  in ZFS.  As for inodes, well, since there isn't really such a thing as
  an inode in ZFS...  what exactly are you trying to do here?
  
  -r

-- 
-Barry Shein

The World  | b...@theworld.com   | http://www.TheWorld.com
Purveyors to the Trade | Voice: 800-THE-WRLD| Dial-Up: US, PR, Canada
Software Tool  Die| Public Access Internet | SINCE 1989 *oo*


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Jimmy Hess
On Thu, Dec 11, 2014 at 9:05 PM, Barry Shein b...@world.std.com wrote:
[snip]
 From my reading the closest you can get to disk space quotas in ZFS is
 by limiting on a per directory (dataset, mount) basis which is similar
 but different.

This is the normal type of quota within ZFS.  It is applied to a
dataset and limits the size of the dataset, such as
home/username.
You can have as many datasets (filesystems) as you like (within
practical limits), which is probably the way to go for home
directories.
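
For example, a per-user dataset with a size cap might look something like
this (the example1/home layout and the username are just placeholders):

zfs create example1/home
zfs create -o quota=10G example1/home/alice
# raise or lower the cap later as needed
zfs set quota=20G example1/home/alice

As far as I know, the plain per-dataset quota property behaves the same
way on the illumos and FreeBSD implementations.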

But another option is

zfs set groupquota@groupname=100GB   example1/blah
zfs set userquota@user1=200MB   example1/blah

This would be available on the  Solaris implementation.
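
If the implementation you are on does support them, you can sanity-check
what got applied and what each user is actually consuming with something
like:

zfs get userquota@user1,groupquota@groupname example1/blah
zfs userspace example1/blah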


I am not 100% certain that this is available under the BSD implementations,
even if QUOTA is enabled in your kernel config.

In the past the BSD implementation of ZFS never seemed to be as
stable, functional, or performant as the OpenSolaris/Illumos version.

--
-JH


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Jon Lewis

On Thu, 11 Dec 2014, Jimmy Hess wrote:


I am not 100% certain that this is available under the BSD implementations,
even if QUOTA is enabled in your kernel config.

In the past the BSD implementation of ZFS never seemed to be as
stable, functional, or performant as the OpenSolaris/Illumos version.


That's a scary low bar for comparison.
OpenSolaris (or even Solaris 11), ZFS, Stable.  Pick one.  Maybe two. 
Three?  Yeah right.  Anyone who's used it hard, under heavy load, should 
understand.


--
 Jon Lewis, MCP :)   |  I route
 |  therefore you are
_ http://www.lewis.org/~jlewis/pgp for PGP public key_


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-11 Thread Javier J
Hey guys, I am running it on FreeBSD (NAS4Free).

It's my understanding that when a resilver happens in a zpool, only the
data that has actually been written to the disks gets read and rebuilt,
unlike a traditional RAID5 rebuild, which reads the whole array, even
empty blocks. I know I should be using RAIDZ2 for an array this size,
but I have daily backups off of this array and also this is a lab, not a
production environment. In a production environment I would use RAIDZ2
or RAIDZ3. The bottom line is that even plain RAIDZ1 is way better than
any RAID5 hardware/software solution I have come across. Apparently a
single disk with ZFS can survive 1/8 of the disk being destroyed. ZFS
itself has many protections against data corruption. I have also
scheduled a zpool scrub to run twice a week, to catch bit-rot before it
becomes unrecoverable.
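
The scheduling itself is nothing exotic; a root crontab entry along these
lines does it (the pool name tank is just a placeholder):

# scrub the pool every Monday and Thursday at 03:00
0 3 * * 1,4 /sbin/zpool scrub tank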

Anyway, I have been using Linux RAID since it has been available, and I
ask myself why I didn't use ZFS seriously before now.

- J

On Thu, Dec 11, 2014 at 11:06 AM, Bacon Zombie baconzom...@gmail.com
wrote:

 Are you running ZFS and RAIDZ on Linux or BSD?
 On 10 Dec 2014 23:21, Javier J jav...@advancedmachines.us wrote:

 I'm just going to chime in here since I recently had to deal with bit-rot
 affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)

 We couldn't rebuild because of 5 URE sectors on one of the other disks in
 the array after a power / ups issue rebooted our storage box.

 We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
 using ZFS years ago?

 +1 for ZFS and RAIDZ



 On Wed, Dec 10, 2014 at 8:40 AM, Rob Seastrom r...@seastrom.com wrote:

 
  The subject is drifting a bit but I'm going with the flow here:
 
  Seth Mos seth@dds.nl writes:
 
   Raid10 is the only valid raid format these days. With the disks as big
   as they get these days it's possible for silent corruption.
 
  How do you detect it?  A man with two watches is never sure what time it
  is.
 
  Unless you have a filesystem that detects and corrects silent
  corruption, you're still hosed, you just don't know it yet.  RAID10
  between the disks in and of itself doesn't help.
 
   And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
   rebuilds that take a week, literally. Although the rebuild rate on our
   11 disk raid 6 SSD array (2TB) is less than a day.
 
  I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
  It took nowhere near a day let alone a week.  Theoretically takes 8-11
  hours if the vdev is completely full, proportionately less if it's
  not, and I was at about 2/3 in use.
 
  -r
 
 




Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Seth Mos
symack schreef op 9-12-2014 22:03:
 * Can I change from an active (ie, disks with data) raid 5 to raid 10.
 There are 4 drives

Dump and restore. I've used Acronis successfully in the past and today,
they have a bootable ISO. Also, if you have the option, they have
universal restore so you can restore Windows on another piece of
hardware (you provide the drivers).

 in the unit, and I have two on the shelf that I can plug in.
 * If so, will I have less of a performance impact with RAID 10 + write-thru
 than RAID 5 + write-through?

Raid10 is the only valid raid format these days. With the disks as big
as they get these days it's possible for silent corruption.

And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
rebuilds that take a week, literally. Although the rebuild rate on our
11 disk raid 6 SSD array (2TB) is less than a day.

If it accepts sata drives, consider just using SSDs instead. They're
just 600 euros for an 800GB drive (Intel S3500).

 Given I can move from RAID 5 to RAID 10 without losing data. How long to
 anticipate downtime for this process? Is there heavy sector re-arranging
 happening here? And the same for write-thru, is it done quick?

Heavy sector re-arranging, yes, so just dump and restore; it's faster
and more reliable. Also, you then have a working bare metal restore backup.

Regards,

Seth


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Stuart Henderson
On 2014-12-09, symack sym...@gmail.com wrote:
 Server down. Got to colo at 4:39 and an old IBM X346 node with
 Serveraid-7k has failed. Opened it up to find a swollen cache battery that
 has bent the card in three different axes.

 * Can I change from an active (ie, disks with data) raid 5 to raid 10.

Even if the hw/firmware supports it, raid level migration is risky enough
at the best of times, and totally insane on a known-bad controller.



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Rob Seastrom

The subject is drifting a bit but I'm going with the flow here:

Seth Mos seth@dds.nl writes:

 Raid10 is the only valid raid format these days. With the disks as big
 as they get these days it's possible for silent corruption.

How do you detect it?  A man with two watches is never sure what time it is.

Unless you have a filesystem that detects and corrects silent
corruption, you're still hosed, you just don't know it yet.  RAID10
between the disks in and of itself doesn't help.
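
With ZFS, for example, every block is checksummed, so a periodic scrub
will both detect silent corruption and, given redundancy, repair it, and
the errors show up in the status output rather than staying silent.
Something like (pool name is just an example):

zpool scrub tank
zpool status -v tank    # checksum error counters plus any damaged files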

 And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
 rebuilds that take a week, literally. Although the rebuild rate on our
 11 disk raid 6 SSD array (2TB) is less than a day.

I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
It took nowhere near a day let alone a week.  Theoretically takes 8-11
hours if the vdev is completely full, proportionately less if it's
not, and I was at about 2/3 in use.
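
The back-of-the-envelope math is simple enough, assuming a sustained
resilver rate somewhere around 100-140 MB/s per drive (an assumption
about the drives, not a measured number):

  4 TB / 100 MB/s = ~40,000 s = ~11 hours
  4 TB / 140 MB/s = ~28,500 s = ~8 hours
  at about 2/3 full, roughly 2/3 of that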

-r



Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Javier J
I'm just going to chime in here since I recently had to deal with bit-rot
affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)

We couldn't rebuild because of 5 URE sectors on one of the other disks in
the array after a power / ups issue rebooted our storage box.

We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
using ZFS years ago?

+1 for ZFS and RAIDZ



On Wed, Dec 10, 2014 at 8:40 AM, Rob Seastrom r...@seastrom.com wrote:


 The subject is drifting a bit but I'm going with the flow here:

 Seth Mos seth@dds.nl writes:

  Raid10 is the only valid raid format these days. With the disks as big
  as they get these days it's possible for silent corruption.

 How do you detect it?  A man with two watches is never sure what time it
 is.

 Unless you have a filesystem that detects and corrects silent
 corruption, you're still hosed, you just don't know it yet.  RAID10
 between the disks in and of itself doesn't help.

  And with 4TB+ disks that is a real thing.  Raid 6 is ok, if you accept
  rebuilds that take a week, literally. Although the rebuild rate on our
  11 disk raid 6 SSD array (2TB) is less than a day.

 I did a rebuild on a RAIDZ2 vdev recently (made out of 4tb WD reds).
 It took nowhere near a day let alone a week.  Theoretically takes 8-11
 hours if the vdev is completely full, proportionately less if it's
 not, and I was at about 2/3 in use.

 -r




Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Joe Greco
 I'm just going to chime in here since I recently had to deal with bit-rot
 affecting a 6TB linux raid5 setup using mdadm (6x 1TB disks)
 
 We couldn't rebuild because of 5 URE sectors on one of the other disks in
 the array after a power / ups issue rebooted our storage box.
 
 We are now using ZFS RAIDZ and the question I ask myself is, why wasn't I
 using ZFS years ago?
 
 +1 for ZFS and RAIDZ

I hope you are NOT using RAIDZ.  The chances of an error showing up
during a resilver are uncomfortably high, and there are no automatic
tools to fix pool corruption with ZFS.  Ideally use RAIDZ2 or RAIDZ3
to provide more appropriate levels of protection.  Errors introduced
into a pool can cause substantial unrecoverable damage to the pool,
so you really want the bitrot detection and correction mechanisms to
be working as designed.
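
Creating the pool with double parity in the first place is a one-liner;
the device names below are made up, so substitute your own:

zpool create tank raidz2 da0 da1 da2 da3 da4 da5
zpool status tank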

... JG
-- 
Joe Greco - sol.net Network Services - Milwaukee, WI - http://www.sol.net
We call it the 'one bite at the apple' rule. Give me one chance [and] then I
won't contact you again. - Direct Marketing Ass'n position on e-mail spam(CNN)
With 24 million small businesses in the US alone, that's way too many apples.


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Randy Bush
 We are now using ZFS RAIDZ and the question I ask myself is, why
 wasn't I using ZFS years ago?

because it is not production on linux, which i have to use because
freebsd does not have kvm/ganeti.  want zfs very very badly.  snif.

randy


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Gary Buhrmaster
On Thu, Dec 11, 2014 at 2:25 AM, Randy Bush ra...@psg.com wrote:
 We are now using ZFS RAIDZ and the question I ask myself is, why
 wasn't I using ZFS years ago?

 because it is not production on linux,

Well, it depends on what you mean by
production.  Certainly the ZFS on Linux
group has said in some forums that it is
production ready, although I would say
that their definition is not exactly the
same as what I mean by the term.

 which i have to use because
 freebsd does not have kvm/ganeti.

There is bhyve, and virt-manager can
support bhyve in later versions (but is
disabled by default as I recall).  Not
exactly the same, of course.

 want zfs very very badly.  snif.

Anyone who really cares about their data
wants ZFS.  Some just do not yet know
that they (should) want it.

There is always Illumos/OmniOS/SmartOS
to consider (depending on your particular
requirements) which can do ZFS and KVM.


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-10 Thread Randy Bush
zfs and ganeti
-- 
Phones are not computers and suck for email

On December 11, 2014 2:39:19 PM GMT+09:00, Gary Buhrmaster 
gary.buhrmas...@gmail.com wrote:
On Thu, Dec 11, 2014 at 2:25 AM, Randy Bush ra...@psg.com wrote:
 We are now using ZFS RAIDZ and the question I ask myself is, why
 wasn't I using ZFS years ago?

 because it is not production on linux,

Well, it depends on what you mean by
production.  Certainly the ZFS on Linux
group has said in some forums that it is
production ready, although I would say
that their definition is not exactly the
same as what I mean by the term.

 which i have to use because
 freebsd does not have kvm/ganeti.

There is bhyve, and virt-manager can
support bhyve in later versions (but is
disabled by default as I recall).  Not
exactly the same, of course.

 want zfs very very badly.  snif.

Anyone who really cares about their data
wants ZFS.  Some just do not yet know
that they (should) want it.

There is always Illumos/OmniOS/SmartOS
to consider (depending on your particular
requirements) which can do ZFS and KVM.


Got a call at 4am - RAID Gurus Please Read

2014-12-09 Thread symack
Server down. Got to colo at 4:39 and an old IBM X346 node with
Serveraid-7k has failed. Opened it up to find a swollen cache battery that
has bent the card in three different axes. Separated the battery. (i)
Inspect card and plug back in, (ii) reboot, and got (code 2807) Not
functioning
Return to (i) x3 got same result. Dusted her off and let it sit for a while
plugged in, rebooted to see if I can get her to write-through mode, disks
start spinning. Hooray.

Plan of action, (and the reason for my post):

* Can I change from an active (ie, disks with data) raid 5 to raid 10.
There are 4 drives
in the unit, and I have two on the shelf that I can plug in.
* If so, will I have less of a performance impact with RAID 10 + write-thru
than RAID 5 + write-through?
* When the new raid card comes in, can I just plug it in without losing my
data? I would:

i) RAID 10
ii) Write-thru
iii) Replace card

The new card is probably coming with a bad battery that would put us kind
of back at square one. New batteries are 200+ if I can find them. Best case
scenario is move it over to RAID 10+Write-thru, and feel less of the
performance pinch.

Given I can move from RAID 5 to RAID 10 without losing data. How long to
anticipate downtime for this process? Is there heavy sector re-arranging
happening here? And the same for write-thru, is it done quick?

I'm going to go lay down just for a little while.

Thanks in Advance,

Nick from Toronto.


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-09 Thread Michael Brown
If the serveraid7k cards are LSI and not Adaptec based (I think they are) you 
should just be able to plug in a new adapter and import the foreign 
configuration.

You do have a good backup, yes?

Switching to write-through has already happened (unless you specified 
WriteBackModeEvenWithNoBBU - not the default) - these (LSI) cards by default
only WB when safe.

If WT, RAID10 much better perf. BUT you just can't migrate from R5 to R10 
non-destructively.

- Michael from Kitchener
  Original Message  
From: symack
Sent: Tuesday, December 9, 2014 16:04
To: nanog@nanog.org
Subject: Got a call at 4am - RAID Gurus Please Read

Server down. Got to colo at 4:39 and an old IBM X346 node with
Serveraid-7k has failed. Opened it up to find a swollen cache battery that
has bent the card in three different axes. Separated the battery. (i)
Inspect card and plug back in, (ii) reboot, and got (code 2807) Not
functioning
Return to (i) x3 got same result. Dusted her off and let it sit for a while
plugged in, rebooted to see if I can get her to write-through mode, disks
start spinning. Hooray.

Plan of action, (and the reason for my post):

* Can I change from an active (ie, disks with data) raid 5 to raid 10.
There are 4 drives
in the unit, and I have two on the shelf that I can plug in.
* If so, will I have less of a performance impact with RAID 10 + write-thru
than RAID 5 + write-through?
* When the new raid card comes in, can I just plug it in without losing my
data? I would:

i) RAID 10
ii) Write-thru
iii) Replace card

The new card is probably coming with a bad battery that would put us kind
of back at square one. New batteries are 200+ if I can find them. Best case
scenario is move it over to RAID 10+Write-thru, and feel less of the
performance pinch.

Given I can move from RAID 5 to RAID 10 without losing data. How long to
anticipate downtime for this process? Is there heavy sector re-arranging
happening here? And the same for write-thru, is it done quick?

I'm going to go lay down just for a little while.

Thanks in Advance,

Nick from Toronto.


Re: Got a call at 4am - RAID Gurus Please Read

2014-12-09 Thread Allen McKinley Kitchen (gmail)
+1 on the most important statement below, from my point of view: RAID 5 and 
RAID 10 are totally separate animals and while you can set up a separate RAID 
10 array and migrate your data to it (as soon as possible!!!) you cannot 
migrate from 5 to 10 in place absent some utter magic that I am unaware of.

10 requires more raw drive space but offers significant write performance 
advantages when correctly configured (which isn't really too difficult). 5 is 
fine for protection against losing one drive, but 5 requires much more internal 
processing of writeable data before it begins the writes and, not too long ago, 
was considered completely inappropriate for applications with high numbers of 
writes, such as a transactional database.
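
The arithmetic behind that write penalty, for anyone who hasn't seen it:

  RAID 10 small write: 2 I/Os (write both halves of the mirror)
  RAID 5  small write: 4 I/Os (read old data + old parity, write new data + new parity)
  RAID 6  small write: 6 I/Os (same read-modify-write, with two parity blocks)

which is why parity RAID was long considered a poor fit for write-heavy
transactional workloads.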

Still, 5 is often used for database systems in casual installations just 
because it's easy, cheap (relatively) and modern fast boxes are fast enough. 

Ok, getting down off my RAID soapbox - good luck.

..Allen

 On Dec 9, 2014, at 17:22, Michael Brown mich...@supermathie.net wrote:
 
 If the serveraid7k cards are LSI and not Adaptec based (I think they are) you 
 should just be able to plug in a new adapter and import the foreign 
 configuration.
 
 You do have a good backup, yes?
 
 Switching to write-through has already happened (unless you specified 
 WriteBackModeEvenWithNoBBU - not the default) - these (LSI) cards by default
 only WB when safe.
 
 If WT, RAID10 much better perf. BUT you just can't migrate from R5 to R10 
 non-destructively.
 
 - Michael from Kitchener
   Original Message  
 From: symack
 Sent: Tuesday, December 9, 2014 16:04
 To: nanog@nanog.org
 Subject: Got a call at 4am - RAID Gurus Please Read
 
 Server down. Got to colo at 4:39 and an old IBM X346 node with
 Serveraid-7k has failed. Opened it up to find a swollen cache battery that
 has bent the card in three different axes. Separated the battery. (i)
 Inspect card and plug back in, (ii) reboot, and got (code 2807) Not
 functioning
 Return to (i) x3 got same result. Dusted her off and let it sit for a while
 plugged in, rebooted to see if I can get her to write-through mode, disks
 start spinning. Hooray.
 
 Plan of action, (and the reason for my post):
 
 * Can I change from an active (ie, disks with data) raid 5 to raid 10.
 There are 4 drives
 in the unit, and I have two on the shelf that I can plug in.
 * If so, will I have less of a performance impact with RAID 10 + write-thru
 than RAID 5 + write-through?
 * When the new raid card comes in, can I just plug it in without losing my
 data? I would:
 
 i) RAID 10
 ii) Write-thru
 iii) Replace card
 
 The new card is probably coming with a bad battery that would put us kind
 of back at square one. New batteries are 200+ if I can find them. Best case
 scenario is move it over to RAID 10+Write-thru, and feel less of the
 performance pinch.
 
 Given I can move from RAID 5 to RAID 10 without losing data. How long to
 anticipate downtime for this process? Is there heavy sector re-arranging
 happening here? And the same for write-thru, is it done quick?
 
 I'm going to go lay down just for a little while.
 
 Thanks in Advance,
 
 Nick from Toronto.