[zfs-discuss] RAID Z stripes

2010-08-10 Thread Terry Hull
I am wanting to build a server with 16 1TB drives configured as two 8-drive
RAID-Z2 arrays striped together.  However, I would like the capability of
adding additional stripes of 2TB drives in the future.  Will this be a
problem?  I thought I read it is best to keep the stripes the same width and
was planning to do that, but I was wondering about using drives of different
sizes.  These drives would all be in a single pool.
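
For reference, the layout I have in mind would look roughly like this (the
device names below are only placeholders, not my actual targets):

# initial pool: two 8-drive RAID-Z2 vdevs striped together
zpool create tank \
    raidz2 c1t0d0 c1t1d0 c1t2d0 c1t3d0 c1t4d0 c1t5d0 c1t6d0 c1t7d0 \
    raidz2 c2t0d0 c2t1d0 c2t2d0 c2t3d0 c2t4d0 c2t5d0 c2t6d0 c2t7d0

# later: grow the pool by adding another 8-drive RAID-Z2 of 2TB disks
zpool add tank \
    raidz2 c3t0d0 c3t1d0 c3t2d0 c3t3d0 c3t4d0 c3t5d0 c3t6d0 c3t7d0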

--
Terry Hull
Network Resource Group, Inc.   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] RAID Z stripes

2010-08-10 Thread Terry Hull

 From: Phil Harman phil.har...@gmail.com
 Date: Tue, 10 Aug 2010 09:24:52 +0100
 To: Ian Collins i...@ianshome.com
 Cc: Terry Hull t...@nrg-inc.com, zfs-discuss@opensolaris.org
 zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] RAID Z stripes
 
 On 10 Aug 2010, at 08:49, Ian Collins i...@ianshome.com wrote:
 
 On 08/10/10 06:21 PM, Terry Hull wrote:
 I am wanting to build a server with 16 1TB drives configured as two
 8-drive RAID-Z2 arrays striped together. However, I would like the
 capability of adding additional stripes of 2TB drives in the future.
 Will this be a problem? I thought I read it is best to keep the stripes
 the same width and was planning to do that, but I was wondering about
 using drives of different sizes. These drives would all be in a single
 pool.
 
 It would work, but you run the risk of the smaller drives becoming
 full and all new writes going to the bigger vdev. So while usable,
 performance would suffer.
 
 Almost by definition, the 1TB drives are likely to be getting full
 when the new drives are added (presumably because of running out of
 space).
 
 Performance can only be said to suffer relative to a new pool built
 entirely with drives of the same size. Even if he added 8x 2TB drives
 in a RAIDZ3 config it is hard to predict what the performance gap will
 be (on the one hand: RAIDZ3 vs RAIDZ2, on the other: an empty group vs
 an almost full, presumably fragmented, group).
 
 One option would be to add 2TB drives as 5 drive raidz3 vdevs. That
 way your vdevs would be approximately the same size and you would
 have the optimum redundancy for the 2TB drives.
 
 I think you meant 6, but I don't see a good reason for matching the
 group sizes. I'm for RAIDZ3, but I don't see much logic in mixing
 groups of 6+2 x 1TB and 3+3 x 2TB in the same pool (in one group I
 appear to care most about maximising space, in the other I'm
 maximising availability)
 
 The other issue is that of hot spares. In a pool of mixed size drives
 you either waste array slots (by having spares of different sizes) or
 plan to have unavailable space when small drives are replaced by large
 ones.


So do I understand correctly that really the Right thing to do is to build a
pool not only with a consistent stripe width, but also with drives of only
one size?  It also sounds like, from a practical point of view, building the
pool full-sized from the start is the best policy, so that the data can be
spread relatively uniformly across all the drives from the very beginning.
In my case, I think what I will do is to start with the 16 drives in a single
pool and, when I need more space, create a new pool and manually move some of
the existing data to it to spread the IO load.
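
Something along these lines is what I have in mind for moving datasets over
to the second pool later on (the pool and dataset names here are only
placeholders):

# snapshot a dataset on the original pool and copy it to the new pool
zfs snapshot tank/vm-images@migrate
zfs send tank/vm-images@migrate | zfs receive tank2/vm-images

# once the copy has been verified, free the space on the original pool
zfs destroy -r tank/vm-images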

The other issue here seems to be RAIDZ2 vs RAIDZ3.  I assume there is not a
significant performance difference between the two for most loads, but
rather I choose between them based on how badly I want the array to stay
intact.  
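
Just to put numbers on it for my 8-drive groups (raw capacity, before any ZFS
overhead):

raidz2: (8 - 2) x 1 TB = 6 TB usable per group
raidz3: (8 - 3) x 1 TB = 5 TB usable per group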

-
Terry





Re: [zfs-discuss] PowerEdge R510 with PERC H200/H700 with ZFS

2010-08-07 Thread Terry Hull

 From: Geoff Nordli geo...@gnaa.net
 Date: Sat, 7 Aug 2010 08:39:46 -0700
 To: zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] PowerEdge R510 with PERC H200/H700 with ZFS
 
 
 
 -Original Message-
 From: Brian Hechinger [mailto:wo...@4amlunch.net]
 Sent: Saturday, August 07, 2010 8:10 AM
 To: Geoff Nordli
 Subject: Re: [zfs-discuss] PowerEdge R510 with PERC H200/H700 with ZFS
 
 On Sat, Aug 07, 2010 at 08:00:11AM -0700, Geoff Nordli wrote:
 Anyone have any experience with a R510 with the PERC H200/H700
 controller with ZFS?
 
 Not that particular setup, but I do run Solaris on a Precision 690 with
 PERC 6i
 controllers.
 
 My perception is that Dell doesn't play well with OpenSolaris.
 
 What makes you say that?  I've run Solaris on quite a few Dell boxes and
 have
 never had any issues.
 
 -brian
 --
  
 Hi Brian.
 
 I am glad to hear that, because I would prefer to use a dell box.
 
 Is there a JBOD mode with the PERC 6i?
 
 It is funny how sometimes one forms these views as you gather information.
 
 Geoff   

It is just that lots of the PERC controllers do not do JBOD very well.  I've
worked around that several times by creating a single-drive RAID 0 for each
disk.  Unfortunately, that means the server has lots of RAID hardware that is
not utilized very well.
Also, ZFS loves to see lots of spindles, and Dell boxes tend not to have
lots of drive bays in comparison to what you can build at a given price
point.   Of course then you have warranty / service issues to consider.

--
Terry Hull
Network Resource Group, Inc.




Re: [zfs-discuss] PowerEdge R510 with PERC H200/H700 with ZFS

2010-08-07 Thread Terry Hull



 From: Geoff Nordli geo...@gnaa.net
 Date: Sat, 7 Aug 2010 14:11:37 -0700
 To: Terry Hull t...@nrg-inc.com, zfs-discuss@opensolaris.org
 Subject: RE: [zfs-discuss] PowerEdge R510 with PERC H200/H700 with ZFS
[stuff deleted]
 
 Terry, you are right, the part that was really missing with the Dell was the
 lack of spindles.  It seems the R510 can have up to 12 spindles.
 
 The online configurator only allows you to select SLC SSDs, which are a lot
 more expensive than the MLC versions.  It would be nice to do MLC since that
 works fine for L2ARC.
 
 I believe they have an onboard SD flash connector too.  It would be great to
 be able to install the base OS onto a flash card and not waste two drives.
 
 Are you using the Broadcom or Intel NICs?
  
 For sure the benefit of buying name brand is the warranty/service side of
 things, which is important to me.  I don't want to spend any time
 worrying/fixing boxes.

I understand that one.

I have been using both Intel and Broadcom NICs successfully.   My gut tells
me I like the Intel better, but I can't say that is because I have had
trouble with the Broadcom.  It is just a personal preference that I probably
can't justify.   

--
Terry 




[zfs-discuss] ZFS / Network tuning recommendations for iSCSI

2010-08-05 Thread Terry Hull
I have seen some network recommendations for tuning a small storage server
from the network side.  I am currently using this set and wondered if there
were other things I should be tweaking:

ndd -set /dev/tcp tcp_xmit_hiwat 1048576
ndd -set /dev/tcp tcp_recv_hiwat 8388608
ndd -set /dev/tcp tcp_max_buf 8388608
ndd -set /dev/udp udp_xmit_hiwat 1048576
ndd -set /dev/udp udp_recv_hiwat 8388608
ndd -set /dev/tcp tcp_conn_req_max_q 65536
ndd -set /dev/tcp tcp_conn_req_max_q0 65536
ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
ndd -set /dev/tcp tcp_naglim_def 1

I realize the UDP options have nothing to do with iSCSI, but I applied them
anyway as it seemed to make sense.
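
One related note: ndd settings do not survive a reboot, so the idea is to
re-apply them from a boot-time script, roughly like this (the script name and
location are arbitrary):

#!/sbin/sh
# /etc/rc2.d/S99nettune -- re-apply network tuning at boot
ndd -set /dev/tcp tcp_xmit_hiwat 1048576
ndd -set /dev/tcp tcp_recv_hiwat 8388608
ndd -set /dev/tcp tcp_max_buf 8388608
ndd -set /dev/udp udp_xmit_hiwat 1048576
ndd -set /dev/udp udp_recv_hiwat 8388608
ndd -set /dev/tcp tcp_conn_req_max_q 65536
ndd -set /dev/tcp tcp_conn_req_max_q0 65536
ndd -set /dev/tcp tcp_fin_wait_2_flush_interval 67500
ndd -set /dev/tcp tcp_naglim_def 1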

The machine I'm using has an 8-drive RAIDZ-2 of 1TB drives, a quad-core
processor, and 4GB of RAM.  98% of its load is as an iSCSI target for ESX.  I
do have write caching turned on and have verified that turning it off causes
a significant write performance penalty.  I currently am not using bonded
NICs, but am using jumbo frames.  Are there other things I should be
tweaking?

--
Terry Hull
Network Resource Group, Inc.


[zfs-discuss] Logical Units and ZFS send / receive

2010-08-04 Thread Terry Hull

I have a logical unit created with sbdadm create-lu that I am replicating
with zfs send / receive between two build 134 hosts.  These LUs are iSCSI
targets used as VMFS filesystems and ESX RDMs mounted on a Windows 2003
machine.  The zfs pool names are the same on both machines.  The replication
seems to be going correctly.  However, when I try to use the LUs on the
server I am replicating the data to, I have issues.  Here is the scenario:

The LUs are created as sparse.  Here is the process I'm going through after
the snapshots are replicated to a secondary machine:
* Original machine:  svccfg export -a stmf > /tmp/stmf.cfg
* Copy stmf.cfg to second machine:
* Secondary machine:  svcadm disable stmf
* svccfg delete stmf
* cd /var/svc/manifest
* svccfg import system/stmf.xml
* svcadm disable stmf
* svccfg import /tmp/stmf.cfg

At this point stmfadm list-lu -v shows the SCSI LUs all as "unregistered".

When I try to import the LUs I get: stmfadm: meta data error

I am using the command:
stmfadm import-lu /dev/zvol/rdsk/pool-name

to import the LU

It is as if the pool does not exist.  However, I can verify that the pool
does actually exist with zfs list and with zfs list -t snapshot to show the
snapshot that I replicated.


Any suggestions?  
--
Terry Hull
Network Resource Group, Inc.




Re: [zfs-discuss] Logical Units and ZFS send / receive

2010-08-04 Thread Terry Hull

 From: Richard Elling rich...@nexenta.com
 Date: Wed, 4 Aug 2010 11:05:21 -0700
 Subject: Re: [zfs-discuss] Logical Units and ZFS send / receive
 
 On Aug 3, 2010, at 11:58 PM, Terry Hull wrote:
 I have a logical unit created with sbdadm create-lu that I am replicating
 with zfs send / receive between two build 134 hosts.  These LUs are iSCSI
 targets used as VMFS filesystems and ESX RDMs mounted on a Windows 2003
 machine.  The zfs pool names are the same on both machines.  The replication
 seems to be going correctly.  However, when I try to use the LUs on the
 server I am replicating the data to, I have issues.  Here is the scenario:
 
 The LUs are created as sparse.  Here is the process I'm going through after
 the snapshots are replicated to a secondary machine:
 
 How did you replicate? In b134, the COMSTAR metadata is placed in
 hidden parameters in the dataset. These are not transferred via zfs send,
 by default.  This metadata includes the LU.
  -- richard

Does the -p option on the zfs send solve that problem? What else is not sent
by default?   In other words, am I better off sending the metadata with the
zfs send, or am I better off just creating the GUID once I get the data
transferred?  
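
To be concrete, the kind of invocation I had in mind is roughly this (the
dataset and host names are placeholders):

# send the dataset's locally set properties along with the stream
zfs snapshot tank/iscsi/lu-vol@repl1
zfs send -p tank/iscsi/lu-vol@repl1 | \
    ssh backuphost zfs receive -F tank/iscsi/lu-vol

# or send a full recursive replication stream (snapshots plus properties)
zfs send -R tank/iscsi/lu-vol@repl1 | \
    ssh backuphost zfs receive -Fd tank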

--
Terry Hull
Network Resource Group, Inc.




Re: [zfs-discuss] Logical Units and ZFS send / receive

2010-08-04 Thread Terry Hull



 From: Richard Elling rich...@nexenta.com
 Date: Wed, 4 Aug 2010 18:40:49 -0700
 To: Terry Hull t...@nrg-inc.com
 Cc: zfs-discuss@opensolaris.org zfs-discuss@opensolaris.org
 Subject: Re: [zfs-discuss] Logical Units and ZFS send / receive
 
 On Aug 4, 2010, at 1:27 PM, Terry Hull wrote:
 From: Richard Elling rich...@nexenta.com
 Date: Wed, 4 Aug 2010 11:05:21 -0700
 Subject: Re: [zfs-discuss] Logical Units and ZFS send / receive
 
 On Aug 3, 2010, at 11:58 PM, Terry Hull wrote:
 I have a logical unit created with sbdadm create-lu that I am replicating
 with zfs send / receive between two build 134 hosts.  These LUs are iSCSI
 targets used as VMFS filesystems and ESX RDMs mounted on a Windows 2003
 machine.  The zfs pool names are the same on both machines.  The replication
 seems to be going correctly.  However, when I try to use the LUs on the
 server I am replicating the data to, I have issues.  Here is the scenario:
 
 The LUs are created as sparse.  Here is the process I'm going through after
 the snapshots are replicated to a secondary machine:
 
 How did you replicate? In b134, the COMSTAR metadata is placed in
 hidden parameters in the dataset. These are not transferred via zfs send,
 by default.  This metadata includes the LU.
 -- richard
 
 Does the -p option on the zfs send solve that problem?
 
 I am unaware of a zfs send -p option.  Did you mean the -R option?
 
 The LU metadata is stored in the stmf_sbd_lu property.  You should be able
 to get/set it.
 

On the source machine I did a

zfs get -H stmf_sbd_lu pool-name.  In my case that gave me

tank/iscsi/bg-man5-vmfs  stmf_sbd_lu
554c4442534e555307020702010001843000b7010100ff86200500c01200180009fff1030010600144f0fa3540004c4f9edb0003

74616e6b2f69736373692f62672d6d616e352d766d6673002f6465762f7a766f6c2f7264736b2f74616e6b2f69736373692f62672d6d616e352d766d667300e70100002200ff080  local
(But it was all one line.)

I cut the numeric section out above and then did a

zfs set stmf_sbd_lu=(above cut section) pool_name

and that seemed to work.  However, when I did a

stmfadm import-lu /dev/zvol/rdsk/pool

I still get a meta file error.

However, when I do a zfs get -H stmf_sbd_lu pool_name on the secondary
system, it now matches the results on the first system.

BTW:  The zfs send -p option is described as Send Properties

It seems like it should not be this hard to transfer an LU with zfs
send / receive.


 What else is not sent
 by default?   In other words, am I better off sending the metadata with the
 zfs send, or am I better off just creating the GUID once I get the data
 transferred?  
 
 I don't think this is a GUID issue.
  -- richard
 
 -- 
 Richard Elling
 rich...@nexenta.com   +1-760-896-4422
 Enterprise class storage for everyone
 www.nexenta.com
 
--
Terry Hull




Re: [zfs-discuss] ZFS mirrored boot disks

2010-02-19 Thread Terry Hull
Interestingly, with the machine running, I can pull the first drive in the 
mirror, replace it with an unformatted one, format it, mirror rpool over to it, 
install the boot loader, and at that point the machine will boot with no 
problems.  It's just when the first disk is missing that I have a problem
with it.

--
Terry


[zfs-discuss] ZFS mirrored boot disks

2010-02-18 Thread Terry Hull
I have a machine with the Supermicro 8 port SATA card installed.  I have had no 
problem creating a mirrored boot disk using the oft-repeated scheme:

prtvtoc /dev/rdsk/c4t0d0s2 | fmthard -s - /dev/rdsk/c4t1d0s2
zpool attach rpool c4t0d0s0 c4t1d0s0
wait for sync
installgrub -m /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c4t1d0s0
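
(By "wait for sync" above, I mean watching the resilver until it reports
completed, for example:)

zpool status rpool | egrep -i 'resilver|state'
# continue once it shows "resilver completed" and both disks are ONLINE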

Unfortunately when I shut the machine down and remove the primary boot disk, it 
will no longer boot.  I get the boot loader, and if I turn off the splash 
screen I see it get to the point of displaying the host name.  At that point, 
it hangs forever.   From the posts I've seen it looks like this is a very 
standard scheme that just works.  What can be missing with my procedure.  

I am running Build 132, if that matters.  

--
Terry


[zfs-discuss] ZFS replication primary secondary

2010-02-10 Thread Terry Hull
First of all, I must apologize.   I'm an OpenSolaris newbie so please don't be 
too hard on me.  

Sorry if this has been beaten to death before, but I could not find it, so here 
goes.  I'm wanting to be able to have two disk servers that I replicate data 
between using send / receive with snapshots.   Yes, I know that is simple 
enough, but what happens when the primary server goes down and I actually need 
to make changes to the secondary?   Can I then replicate the data back to the 
primary server without starting over?  TIA.
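
To make it concrete, what I am hoping is possible is something along these
lines (the pool, dataset, and host names are invented):

# normal direction: incremental send from the primary to the secondary
zfs send -i tank/data@prev tank/data@now | \
    ssh secondary zfs receive tank/data

# after a failover the secondary has the newer data; to go back the other
# way, send an incremental based on the last snapshot both machines share
zfs snapshot tank/data@failback
zfs send -i tank/data@now tank/data@failback | \
    ssh primary zfs receive -F tank/data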


Re: [zfs-discuss] ZFS replication primary secondary

2010-02-10 Thread Terry Hull
Thanks for the info.   

If that last common snapshot gets destroyed on the primary server, it is then a 
full replication back to the primary server.  Is that correct?

--
Terry