Re: [zfs-discuss] Best practice for boot partition layout in ZFS

2011-04-06 Thread Torrey McMahon


On 4/6/2011 11:08 AM, Erik Trimble wrote:

Traditionally, the reason for a separate /var was one of two major items:

(a)  /var was writable, and / wasn't - this was typical of diskless or 
minimal local-disk configurations. Modern packaging systems are making 
this kind of configuration increasingly difficult.


(b) /var held a substantial amount of data, which needed to be handled 
separately from /  - mail and news servers are a classic example



For typical machines nowadays, with large root disks, there is very 
little chance of /var suddenly exploding and filling /  (the classic 
example of being screwed... wink).  Outside of the above two cases, 
about the only other place I can see that having /var separate is a 
good idea is for certain test machines, where you expect frequent  
memory dumps (in /var/crash) - if you have a large amount of RAM, 
you'll need a lot of disk space, so it might be good to limit /var in 
this case by making it a separate dataset.


Some more info a la (b) - the "something filled up the root fs and the 
box crashed" problem was fixed a while ago. It's still a drag 
cleaning up an errant process that is filling up a file system, but it 
shouldn't crash/panic anymore. However, old habits die hard, especially 
at government sites where the rules require a papal bull to be changed, 
so I think the option was left in to keep folks happy more than for any 
practical reason.


I'm sure someone has a really good reason to keep /var separated, but 
those cases are fewer and farther between than I saw 10 years ago.
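
For anyone who does keep /var as its own dataset, capping it is a one-liner. 
A sketch, assuming a hypothetical boot-environment dataset named 
rpool/ROOT/s10be/var:

# zfs set quota=20G rpool/ROOT/s10be/var
# zfs get quota,used rpool/ROOT/s10be/var

That puts a hard ceiling on crash dumps, logs, or an errant process without 
touching / at all.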

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Performance

2011-02-28 Thread Torrey McMahon

On 2/25/2011 4:15 PM, Torrey McMahon wrote:

On 2/25/2011 3:49 PM, Tomas Ögren wrote:

On 25 February, 2011 - David Blasingame Oracle sent me these 2,6K bytes:


  Hi All,

  In reading the ZFS Best practices, I'm curious if this statement is
  still true about 80% utilization.

It happens at about 90% for me.. all of a sudden, the mail server got
butt slow.. killed an old snapshot to get to 85% free or so, then it got
snappy again. S10u9 sparc.


Some of the recent updates have pushed the 80% watermark closer to 90% 
for most workloads.


Sorry folks. I was thinking of yet another change that was in the 
allocation algorithms. 80% is the number to stick with.


... now where did I put my cold medicine? :)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Performance

2011-02-25 Thread Torrey McMahon

On 2/25/2011 3:49 PM, Tomas Ögren wrote:

On 25 February, 2011 - David Blasingame Oracle sent me these 2,6K bytes:


  Hi All,

  In reading the ZFS Best practices, I'm curious if this statement is
  still true about 80% utilization.

It happens at about 90% for me.. all of a sudden, the mail server got
butt slow.. killed an old snapshot to get to 85% free or so, then it got
snappy again. S10u9 sparc.


Some of the recent updates have pushed the 80% watermark closer to 90% 
for most workloads.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] multipath used inadvertantly?

2011-02-15 Thread Torrey McMahon
in.mpathd is the IP multipath daemon. (Yes, it's a bit confusing that 
mpathadm is the storage multipath admin tool.)


If scsi_vhci is loaded in the kernel you have storage multipathing 
enabled. (Check with modinfo.)
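
A quick way to check from the shell - nothing here is specific to this box, 
just the stock Solaris tools:

# modinfo | grep -i vhci        <- is the scsi_vhci multipath driver loaded?
# stmsboot -L                   <- non-STMS to STMS device name mappings
# mpathadm list lu              <- logical units currently under MPxIO control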


On 2/15/2011 3:53 PM, Ray Van Dolson wrote:

I'm troubleshooting an existing Solaris 10U9 server (x86 whitebox) and
noticed its device names are extremely hairy -- very similar to the
multipath device names: c0t5000C50026F8ACAAd0, etc, etc.

mpathadm seems to confirm:

# mpathadm list lu
 /dev/rdsk/c0t50015179591CE0C1d0s2
 Total Path Count: 1
 Operational Path Count: 1

# ps -ef | grep mpath
 root   245 1   0   Jan 05 ?  16:38 /usr/lib/inet/in.mpathd -a

The system is SuperMicro based with an LSI SAS2008 controller in it.
To my knowledge it has no multipath capabilities (or at least not as
it's wired up currently).

The mpt_sas driver is in use per prtconf and modinfo.

My questions are:

- What scenario would the multipath driver get loaded up at
   installation time for this LSI controller?  I'm guessing this is what
   happened?

- If I disabled mpathd would I get the shorter disk device names back
   again?  How would this impact existing zpools that are already on the
   system tied to these disks?  I have a feeling doing this might be a
   little bit painful. :)

I tried to glean the original device names from stmsboot -L, but it
didn't show any mappings...

Thanks,
Ray
___
storage-discuss mailing list
storage-disc...@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/storage-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] One LUN per RAID group

2011-02-15 Thread Torrey McMahon


On 2/14/2011 10:37 PM, Erik Trimble wrote:
That said, given that SAN NVRAM caches are true write caches (and not 
a ZIL-like thing), it should be relatively simple to swamp one with 
write requests (most SANs have little more than 1GB of cache), at 
which point, the SAN will be blocking on flushing its cache to disk. 


Actually, most array controllers now have 10s if not 100s of GB of 
cache. The 6780 has 32GB, the DMX-4 has - if I remember correctly - 256GB. The 
latest HDS box is probably close, if not more.


Of course you still have to flush to disk and the cache flush algorithms 
of the boxes themselves come into play but 1GB was a long time ago.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Best choice - file system for system

2011-01-30 Thread Torrey McMahon

On 1/30/2011 5:26 PM, Joerg Schilling wrote:

Richard Ellingrichard.ell...@gmail.com  wrote:


ufsdump is the problem, not ufsrestore. If you ufsdump an active
file system, there is no guarantee you can ufsrestore it. The only way
to guarantee this is to keep the file system quiesced during the entire
ufsdump.  Needless to say, this renders ufsdump useless for backup
when the file system also needs to accommodate writes.

This is why there is a ufs snapshot utility.


You'll have the same problem. fssnap_ufs(1M) write locks the file system 
when you run the lock command. See the notes section of the man page.


http://download.oracle.com/docs/cd/E19253-01/816-5166/6mbb1kq1p/index.html#Notes
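
For reference, a rough sketch of what the fssnap/ufsdump sequence looks like 
(backing-store path and file system are just examples); the write lock is held 
while the snapshot is established:

# fssnap -F ufs -o bs=/var/tmp/snapstore /export/home
/dev/fssnap/0
# ufsdump 0uf /dev/rmt/0 /dev/rfssnap/0
# fssnap -d /export/home        <- delete the snapshot when the dump is done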


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] reliable, enterprise worthy JBODs?

2011-01-25 Thread Torrey McMahon

On 1/25/2011 2:19 PM, Marion Hakanson wrote:

The only special tuning I had to do was turn off round-robin load-balancing
in the mpxio configuration.  The Seagate drives were incredibly slow when
running in round-robin mode, very speedy without.


Interesting. Did you switch to the load-balance option?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How well does zfs mirror handle temporary disk offlines?

2011-01-18 Thread Torrey McMahon



On 1/18/2011 2:46 PM, Philip Brown wrote:

My specific question is, how easily does ZFS handle*temporary*  SAN 
disconnects, to one side of the mirror?
What if the outage is only 60 seconds?
3 minutes?
10 minutes?
an hour?


Depends on the multipath drivers and the failure mode. For example, if 
the link drops completely at the host HBA connection, some failover 
drivers will mark the path down immediately, which will propagate up the 
stack faster than an intermittent connection or something farther 
downstream failing.



If we have 2x1TB drives, in a simple zfs mirror if one side goes temporarily off 
line, will zfs attempt to resync **1 TB** when it comes back? Or does it have enough 
intelligence to say, oh hey I know this disk..and I know [these bits] are still 
good, so I just need to resync [that bit] ?


My understanding is yes though I can't find the reference for this. (I'm 
sure someone else will find it in short order.)
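
If it helps, one way to watch what a resilver actually touches - hypothetical 
pool and device names:

# zpool offline tank c2t1d0
  (write some data to the pool while one side is offline)
# zpool online tank c2t1d0
# zpool status tank

The status output shows the resilver progress and how much was resilvered, 
which should track the data written during the outage rather than the whole 
1TB device.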

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Changing GUID

2010-11-15 Thread Torrey McMahon
Are those really your requirements? What is it that you're trying to 
accomplish with the data? Make a copy and provide it to another host?


On 11/15/2010 5:11 AM, sridhar surampudi wrote:

Hi I am looking in similar lines,

my requirement is

1. create a zpool on one or many devices ( LUNs ) from an array ( array can be 
IBM or HPEVA or EMC etc.. not SS7000).
2. Create file systems on zpool
3. Once file systems are in use (I/0 is happening) I need to take snapshot at 
array level
  a. Freeze the zfs flle system ( not required due to zfs consistency : source 
: mailing groups)
  b. take array snapshot ( say .. IBM flash copy )
  c. Got new snapshot device (having same data and metadata including same GUID 
of source pool)

   Now I need a way to change the GUID and pool of snapshot device so that the 
snapshot device can be accessible on same host or an alternate host (if the LUN 
is shared).

Could you please post commands for the same.

Regards,
sridhar.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS no longer working with FC devices.

2010-05-23 Thread Torrey McMahon

 On 5/23/2010 11:49 AM, Richard Elling wrote:

FWIW, the A5100 went end-of-life (EOL) in 2001 and end-of-service-life
(EOSL) in 2006. Personally, I  hate them with a passion and would like to
extend an offer to use my tractor to bury the beast:-).


I'm sure I can get some others to help. Can I smash the gbics? Those 
were my favorite. :-)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mpxio load-balancing...it doesn't work??

2010-04-05 Thread Torrey McMahon
Not true. There are different ways that a storage array, and its 
controllers, connect to the host-visible front-end ports, which might be 
confusing the author, but I/O isn't duplicated as he suggests.


On 4/4/2010 9:55 PM, Brad wrote:

I had always thought that with mpxio, it load-balances IO request across your 
storage ports but this article 
http://christianbilien.wordpress.com/2007/03/23/storage-array-bottlenecks/ has 
got me thinking its not true.

The available bandwidth is 2 or 4Gb/s (200 or 400MB/s – FC frames are 10 bytes long 
-) per port. As load balancing software (Powerpath, MPXIO, DMP, etc.) are most of the 
times used both for redundancy and load balancing, I/Os coming from a host can take 
advantage of an aggregated bandwidth of two ports. However, reads can use only one path, 
but writes are duplicated, i.e. a host write ends up as one write on each host port.

Is this true?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] mpxio load-balancing...it doesn't work??

2010-04-05 Thread Torrey McMahon
 The author mentions multipathing software in the blog entry. Kind of 
hard to mix that up with cache mirroring if you ask me.


On 4/5/2010 9:16 PM, Brad wrote:

I'm wondering if the author is talking about cache mirroring, where the cache 
is mirrored between both controllers.  If that is the case, is he saying that for every 
write to the active controller, a second write is issued on the passive controller to keep 
the cache mirrored?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] demise of community edition

2010-01-31 Thread Torrey McMahon
This is a topic for indiana-discuss, not zfs-discuss. If you read 
through the archives of that alias you should see some pointers.


On 1/31/2010 11:38 AM, Tom Bird wrote:

Afternoon,

I note to my dismay that I can't get the community edition any more 
past snv_129, this version was closest to the normal way of doing 
things that I am used to with Solaris = 10, the standard OpenSolaris 
releases seem only to have this horrible Gnome based installer that 
gives you only one option - install everything.


Am I just doing it wrong or is there another way to get OpenSolaris 
installed in a sane manner other than just sticking with community 
edition at snv_129?



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [zones-discuss] Zones on shared storage - a warning

2010-01-08 Thread Torrey McMahon

On 1/8/2010 10:04 AM, James Carlson wrote:

Mike Gerdts wrote:
   

This unsupported feature is supported with the use of Sun Ops Center
2.5 when a zone is put on a NAS Storage Library.
 

Ah, ok.  I didn't know that.

   


Does anyone know how that works? I can't find it in the docs, no one 
inside of Sun seemed to have a clue when I asked around, etc. RTFM 
gladly taken.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and LiveUpgrade

2010-01-07 Thread Torrey McMahon
Make sure you have the latest LU patches installed. There were a lot of fixes 
put back in that area within the last six months or so. 
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thin device support in ZFS?

2009-12-30 Thread Torrey McMahon


On 12/30/2009 2:40 PM, Richard Elling wrote:

There are a few minor bumps in the road. The ATA PASSTHROUGH
command, which allows TRIM to pass through the SATA drivers, was
just integrated into b130. This will be more important to small servers
than SANs, but the point is that all parts of the software stack need to
support the effort. As such, it is not clear to me who, if anyone, inside
Sun is champion for the effort -- it crosses multiple organizational
boundaries. 


I'd think it more important for devices where this is an issue, namely 
SSDs, than it is for spinning rust, though use of the TRIM command, or 
something like it, would fix a lot of the issues I've seen with thin 
provisioning over the last six years or so. However, I'm not sure it's 
going to have much of an impact until you can get the entire stack - 
application to device - rewired to work with the concept behind it. One 
of the biggest issues I've seen with thin provisioning is how the 
applications work, and you can't fix that in the file system code.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] primarycache and secondarycache properties on Solaris 10 u8

2009-10-15 Thread Torrey McMahon

Suggest you start with the man page

http://docs.sun.com/app/docs/doc/819-2240/zfs-1m
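
It isn't a directio switch, but you can control what each dataset is allowed 
to cache. A sketch, dataset name hypothetical:

# zfs set primarycache=metadata pool/oradata    <- ARC caches metadata only
# zfs set secondarycache=none pool/oradata      <- skip the L2ARC for this dataset
# zfs get primarycache,secondarycache pool/oradata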

On 10/15/2009 4:19 PM, Javier Conde wrote:


Hello,

I've seen in the what's new of Solaris 10 update 8 just released 
that ZFS now includes the primarycache and secondarycache properties.


Is this the equivalent of the UFS directio? Does it have a similar 
behavior?


I'm thinking about having a database on ZFS with this option, and 
Oracle recommends to have directio when working on top of a file system.


Thanks in advance and best regards,

Javi

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Petabytes on a budget - blog

2009-09-02 Thread Torrey McMahon

As some Sun folks pointed out

1) No redundancy at the power or networking side
2) Getting 2TB drives in a x4540 would make the numbers closer
3) Performance isn't going to be that great with their design but...they 
might not need it.



On 9/2/2009 2:13 PM, Michael Shadle wrote:
Yeah I wrote them about it. I said they should sell them and even 
better pair it with their offsite backup service kind of like a 
massive appliance and service option.


They're not selling them but did encourage me to just make a copy of 
it. It looks like the only questionable piece in it is the port 
multipliers. Sil3726 if I recall. Which I think just barely is 
becoming supported in the most recent snvs? That's been something I've 
been wanting forever anyway.


You could also just design your own case that is optimized for a bunch 
of disks, a mobo as long as it has ECC support and enough 
pci/pci-x/pcie slots for the amount of cards to add. You might be able 
to build one without port multipliers and just use a bunch of 8, 12, 
or 16 port sata controllers.


I want to design a case that has two layers - an internal layer with 
all the drives and guts and an external layer that pushes air around 
it to exhaust it quietly and has additional noise dampening...


Sent from my iPhone

On Sep 2, 2009, at 11:01 AM, Al Hopper a...@logical-approach.com 
mailto:a...@logical-approach.com wrote:



Interesting blog:

http://blog.backblaze.com/2009/09/01/petabytes-on-a-budget-how-to-build-cheap-cloud-storage/


Regards,

--
Al Hopper  Logical Approach Inc,Plano,TX a...@logical-approach.com 
mailto:a...@logical-approach.com

  Voice: 972.379.2133 Timezone: US CDT
OpenSolaris Governing Board (OGB) Member - Apr 2005 to Mar 2007
http://www.opensolaris.org/os/community/ogb/ogb_2005-2007/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org mailto:zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Compression/copies on root pool RFE

2009-05-05 Thread Torrey McMahon
Before I put one in ... anyone else seen one? Seems we support 
compression on the root pool but there is no way to enable it at install 
time outside of a custom script you run before the installer. I'm 
thinking it should be a real install-time option, have a Jumpstart 
keyword, etc. Same with copies=2.
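
The commands themselves are the easy part, which is why a wrapper script works 
at all. A sketch of what such a script would run once the root pool exists 
(boot environment name hypothetical):

# zfs set compression=on rpool
# zfs set copies=2 rpool/ROOT/s10be
# zfs get compression,copies rpool rpool/ROOT/s10be

Children created afterwards inherit the settings; the point of an install-time 
option is to get them applied before the installer lays down any files.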


Thanks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS + EMC Cx310 Array (JBOD ? Or Singe MetaLUN ?)

2009-05-01 Thread Torrey McMahon

On 5/1/2009 2:01 PM, Miles Nordin wrote:

I've never heard of using multiple-LUN stripes for storage QoS before.
Have you actually measured some improvement in this configuration over
a single LUN?  If so that's interesting.


Because of the way queuing works in the OS and in most array controllers, 
you can get better performance in some workloads if you create more LUNs 
from the underlying RAID set.
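
In other words, instead of presenting one big LUN, carve the same RAID set into 
several LUNs and let ZFS stripe across them, so more I/Os can be queued in 
parallel. A sketch with hypothetical device names:

# zpool create dbpool c4t600A0B800012345Ad0 c4t600A0B800012345Bd0 \
    c4t600A0B800012345Cd0 c4t600A0B800012345Dd0

Same spindles underneath, but each LUN gets its own queue on the host and, on 
many arrays, its own queue in the controller.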

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] StorageTek 2540 performance radically changed

2009-04-20 Thread Torrey McMahon

On 4/20/2009 7:26 PM, Robert Milkowski wrote:

Well, you need to disable cache flushes on zfs side then (or make a
firmware change work) and it will make a difference.
   


If you're running recent OpenSolaris/Solaris/SX builds you shouldn't 
have to disable cache flushing on the array. The driver stack should set 
the correct modes.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS vs ZFS + HW raid? Which is best?

2009-01-20 Thread Torrey McMahon
On 1/20/2009 1:14 PM, Richard Elling wrote:
 Orvar Korvar wrote:

 What does this mean? Does that mean that ZFS + HW raid with raid-5 is not 
 able to heal corrupted blocks? Then this is evidence against ZFS + HW raid, 
 and you should only use ZFS?

 http://www.solarisinternals.com/wiki/index.php/ZFS_Best_Practices_Guide

 ZFS works well with storage based protected LUNs (RAID-5 or mirrored LUNs 
 from intelligent storage arrays). However, ZFS cannot heal corrupted blocks 
 that are detected by ZFS checksums.
  


 It means that if ZFS does not manage redundancy, it cannot correct
 bad data.

And there's no rule that says you can't take two array RAID volumes, of 
any level, and mirror them with ZFS. (Or a few LUNs with RAID-Z.)
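
For example, with two LUNs that are each RAID-5 protected inside their own 
array (device names hypothetical):

# zpool create tank mirror c2t600A0B8000AA0001d0 c3t600A0B8000BB0001d0

or, with a handful of array LUNs, a RAID-Z on top:

# zpool create tank raidz c2t600A0B8000AA0002d0 c3t600A0B8000BB0002d0 \
    c4t600A0B8000CC0002d0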
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zero page reclaim with ZFS

2008-12-29 Thread Torrey McMahon
Cyril Payet wrote:
 Hello there,
 Hitachi USP-V (sold as 9990V by Sun) provides thin provisioning, 
 known as Hitachi Dynamic Provisioning (HDP).
 This gives a way to make the OS believes that a huge lun is 
 available whilst its size is not physically allocated on the 
 DataSystem side.
 A simple example : 100Gb seen by the OS but only 50Gb physically 
 allocated in the frame, in a physical devices stock (called a HDP-pool)
 The USP-V is now able to reclaim zero pages that are not used by 
 a Filesystem.
 Then, it could put them back to this physical pool, as free many 42Mb 
 blocks.
 As far as I know, when a file is deleted, zfs just stop to reference 
 blocks associated to this file, like MMU does with RAM.
 Blocks are not deleted, nor zeored (sounds very good to get back to 
 some files after a crash !).
 Is there a way to transform - a posteriori or a priori - these 
 unreferenced blocks to zero blocks to make the HDS-Frame able to 
 reclaime these ones ? I know that this will create some overhead...
 It might leads to a smaller block allocation history but could be 
 very usefull for zero-pages-reclaim.
 I do hope that my question was clear enough...
 Thanx for your hints,

There are some mainframe filesystems that do such things. I think there 
was also an STK array - Iceberg[?] - that had similar functionality. 
However, why would you use ZFS on top of HDP? If the filesystem lets you 
grow dynamically, and the OS lets you add storage dynamically or grow 
the LUNs when the array does... what does HDP get you?

Serious question, as I get asked it all the time and I can't come up with 
a good answer outside of procedural things such as "We don't like to 
bother the storage guys" or "We thin provision everything no matter the 
app/fs/os" or choose your own adventure.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zero page reclaim with ZFS

2008-12-29 Thread Torrey McMahon
On 12/29/2008 8:20 PM, Tim wrote:


 On Mon, Dec 29, 2008 at 6:09 PM, Torrey McMahon tmcmah...@yahoo.com 
 mailto:tmcmah...@yahoo.com wrote:


 There are some mainframe filesystems that do such things. I think
 there
 was also an STK array - Iceberg[?] - that had similar functionality.
 However, why would you use ZFS on top of HDP? If the filesystem
 lets you
 grow dynamically, and the OS let's you add storage dynamically or grow
 the LUNs when the array doeswhat does HDP get you?

 Serious question as I get asked it all the time and I can't come
 up with
 a good answer outside of procedural things such as, We don't like to
 bother the storage guys or, We thin provision everything no
 matter the
 app/fs/os or choose your own adventure.


 Assign your database admin who swears he needs 2TB day one a 2TB lun.  
 And 6 months from now when he's really only using 200GB, you aren't 
 wasting 1.8TB of disk on him.

I run into the same thing, but once I say "I can add more space without 
downtime" they tend to smarten up. Also, ZFS will not reuse blocks in a, 
for lack of a better word, economical fashion. If you throw them a 2TB 
LUN, ZFS will allocate blocks all over the LUN even when they're only 
using a small fraction.

Unless you have, as the original poster mentioned, an empty-block 
reclaim, you'll have problems. UFS can show the same results, btw.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Zero page reclaim with ZFS

2008-12-29 Thread Torrey McMahon
On 12/29/2008 10:36 PM, Tim wrote:


 On Mon, Dec 29, 2008 at 8:52 PM, Torrey McMahon tmcmah...@yahoo.com 
 mailto:tmcmah...@yahoo.com wrote:

 On 12/29/2008 8:20 PM, Tim wrote:

 I run into the same thing but once I say, I can add more space
 without downtime they tend to smarten up. Also, ZFS will not
 reuse blocks in a, for lack of better words, economical fashion.
 If you throw them a 2TB LUN ZFS will allocate blocks all over the
 LUN when they're only using a small fraction.

 Unless you have, as the original poster mentioned, a empty block
 reclaim you'll have problems. UFS can show the same results btw.


 I'm not arguing anything towards his specific scenario.  You said you 
 couldn't imagine why anyone would ever want thin provisioning, so I 
 told you why.  Some admins do not have the luxury of trying to debate 
 with other teams they work with as to why they should do things a 
 different way than they want to ;)  That speaks nothing of the change 
 control needed to even get a LUN grown in some shops.

 It's out there, it's being used, it isn't a good fit for zfs.

Right...I called those process issues. Perhaps organizational issues 
would have been better?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450

2008-12-07 Thread Torrey McMahon
I'm pretty sure I understand the importance of a snapshot API. (You take 
the snap, then you do the backup or whatever.) My point is that, at 
least on my quick read, you can do most of the same things with the ZFS 
command line utilities. The relevant question would then be how stable 
that is for the type of work we're talking about.
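
For the backup case the CLI sequence is short enough to script. A minimal 
sketch, pool/filesystem and paths hypothetical:

# zfs snapshot tank/data@backup-20081208
# zfs send tank/data@backup-20081208 > /backup/tank-data-20081208.zfs
# zfs destroy tank/data@backup-20081208     <- once the backup is verified

How stable that is under heavy churn is exactly the open question above.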

Joseph Zhou wrote:
 Ok, Torrey, I like you, so one more comment before I go to bed --

 Please go study the EMC NetWorker 7.5, and why EMC can claim 
 leadership in VSS support.
 Then, if you still don't understand the importance of VSS, just ask me 
 in an open fashion, I will teach you.

 The importance of storage in system and application optimization can 
 be very significant.
 You do coding, do you know what's TGT from IBM in COBOL, to be able to 
 claim enterprise technology?
 If not, please study.
 http://publib.boulder.ibm.com/infocenter/pdthelp/v1r1/index.jsp?topic=/com.ibm.entcobol.doc_4.1/PGandLR/ref/rpbug10.htm
  


 Open Storage is a great concept, but we can only win with real 
 advantages, not fake marketing lines.
 I hope everyone enjoyed the discussion. I did.

 zStorageAnalyst


 - Original Message - From: Torrey McMahon [EMAIL PROTECTED]
 To: Joseph Zhou [EMAIL PROTECTED]
 Cc: Richard Elling [EMAIL PROTECTED]; William D. Hathaway 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
 zfs-discuss@opensolaris.org; [EMAIL PROTECTED]
 Sent: Sunday, December 07, 2008 2:40 AM
 Subject: Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun 
 X4150/X4450


 Compared to hw raid only snapshots ZFS is still, imho, easier to use.

 If you start talking about VSS, aka shadow copy for Windows, you're 
 now at the fs level. I can see that VSS offers an API for 3rd parties 
 to use but, as I literally just started reading about it, I'm not an 
 expert. From a quick glance I think the ZFS feature set is 
 comparable. Is there a C++ API to ZFS? Not that I know of. Do you 
 need one? Can't think of a reason off the top of my head given the 
 way the zpool/zfs commands work.

 Joseph Zhou wrote:
 Torrey, now this impressive as the old days with Sun Storage.

 Ok, ZFS PiT is only a software solution.
 The Windows VSS is not only a software solution, but also a 3rd 
 party integration standard from MS.
 What's your comment on ZFS PiT is better than MS PiT, in light of 
 openness and 3rd-party integration???

 Talking about garbage!
 z


 - Original Message - From: Torrey McMahon 
 [EMAIL PROTECTED]
 To: Richard Elling [EMAIL PROTECTED]
 Cc: Joseph Zhou [EMAIL PROTECTED]; William D. 
 Hathaway [EMAIL PROTECTED]; 
 [EMAIL PROTECTED]; zfs-discuss@opensolaris.org; 
 [EMAIL PROTECTED]
 Sent: Sunday, December 07, 2008 1:58 AM
 Subject: Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on 
 Sun X4150/X4450


 Richard Elling wrote:
 Joseph Zhou wrote:

 Yeah?
 http://www.adaptec.com/en-US/products/Controllers/Hardware/sas/value/SAS-31605/_details/Series3_FAQs.htm
  

 Snapshot is a big deal?


 Snapshot is a big deal, but you will find most hardware RAID 
 implementations
 are somewhat limited, as the above adaptec only supports 4 
 snapshots and it is an
 optional feature.  You will find many array vendors will be happy 
 to charge lots
 of money for the snapshot feature.

 On top of that since the ZFS snapshot is at the file system level 
 it's much easier to use. You don't have to quiesce the file system 
 first or hope that when you take the snapshot you get a consistent 
 data set. I've seen plenty of folks take hw raid snapshots without 
 locking the file system first, let alone quiescing the app, and 
 getting garbage.






___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450

2008-12-07 Thread Torrey McMahon
Ian Collins wrote:

  On Mon 08/12/08 08:14 , Torrey McMahon [EMAIL PROTECTED] sent:
   
 I'm pretty sure I understand the importance of a snapshot API. (You take
 the snap, then you do the backup or whatever) My point is that, at 
 least on my quick read, you can do most of the same things with the ZFS
 command line utilities. The relevant question would then be how stable 
 that is for the type of work we're talking about.

 
 Or through the APIs provided by libzfs.

I'm not sure if those are published/supported as opposed to just being 
readable in the source. I think the ADM project is the droid we're 
looking for.

Automatic Data Migration http://opensolaris.org/os/project/adm/
ADM is designed to use the Data Storage Management API (aka XDSM) as
defined in the CAE Specification XDSM as documented by the Open
Group. XDSM provides an Open Standard API to Data Migration
Applications (DMAPI) to manage file backup and recovery, automatic
file migration, and file replication. ADM will take advantage of
these APIs as a privileged application and extension to ZFS. 


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450

2008-12-06 Thread Torrey McMahon
Richard Elling wrote:
 Joseph Zhou wrote:
   
 Yeah?
 http://www.adaptec.com/en-US/products/Controllers/Hardware/sas/value/SAS-31605/_details/Series3_FAQs.htm
 Snapshot is a big deal?
   
 

 Snapshot is a big deal, but you will find most hardware RAID 
 implementations
 are somewhat limited, as the above adaptec only supports 4 snapshots and 
 it is an
 optional feature.  You will find many array vendors will be happy to 
 charge lots
 of money for the snapshot feature.

On top of that, since the ZFS snapshot is at the file system level, it's 
much easier to use. You don't have to quiesce the file system first or 
hope that when you take the snapshot you get a consistent data set. I've 
seen plenty of folks take HW RAID snapshots without locking the file 
system first, let alone quiescing the app, and getting garbage.
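
The snapshot is atomic at the pool transaction level, so a recursive snapshot 
gives you a crash-consistent point in time across every dataset in one shot 
(pool name hypothetical):

# zfs snapshot -r tank@nightly
# zfs list -t snapshot -r tank | grep nightly

Application-level consistency is still the app's problem, but there's no 
file system lock step to forget.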
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun X4150/X4450

2008-12-06 Thread Torrey McMahon
Compared to HW-RAID-only snapshots, ZFS is still, IMHO, easier to use.

If you start talking about VSS, aka shadow copy for Windows, you're now 
at the fs level. I can see that VSS offers an API for 3rd parties to use 
but, as I literally just started reading about it, I'm not an expert. 
From a quick glance I think the ZFS feature set is comparable. Is there 
a C++ API to ZFS? Not that I know of. Do you need one? Can't think of a 
reason off the top of my head given the way the zpool/zfs commands work.

Joseph Zhou wrote:
 Torrey, now this impressive as the old days with Sun Storage.

 Ok, ZFS PiT is only a software solution.
 The Windows VSS is not only a software solution, but also a 3rd party 
 integration standard from MS.
 What's your comment on ZFS PiT is better than MS PiT, in light of 
 openness and 3rd-party integration???

 Talking about garbage!
 z


 - Original Message - From: Torrey McMahon [EMAIL PROTECTED]
 To: Richard Elling [EMAIL PROTECTED]
 Cc: Joseph Zhou [EMAIL PROTECTED]; William D. Hathaway 
 [EMAIL PROTECTED]; [EMAIL PROTECTED]; 
 zfs-discuss@opensolaris.org; [EMAIL PROTECTED]
 Sent: Sunday, December 07, 2008 1:58 AM
 Subject: Re: [zfs-discuss] Hardware Raid Vs ZFS implementation on Sun 
 X4150/X4450


 Richard Elling wrote:
 Joseph Zhou wrote:

 Yeah?
 http://www.adaptec.com/en-US/products/Controllers/Hardware/sas/value/SAS-31605/_details/Series3_FAQs.htm
  

 Snapshot is a big deal?


 Snapshot is a big deal, but you will find most hardware RAID 
 implementations
 are somewhat limited, as the above adaptec only supports 4 snapshots 
 and it is an
 optional feature.  You will find many array vendors will be happy to 
 charge lots
 of money for the snapshot feature.

 On top of that since the ZFS snapshot is at the file system level 
 it's much easier to use. You don't have to quiesce the file system 
 first or hope that when you take the snapshot you get a consistent 
 data set. I've seen plenty of folks take hw raid snapshots without 
 locking the file system first, let alone quiescing the app, and 
 getting garbage. 



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Tuning ZFS for Sun Java Messaging Server

2008-10-24 Thread Torrey McMahon
You may want to ask your SAN vendor if they have a setting you can make 
to no-op the cache flush. That way you don't have to worry about the 
flush behavior if you change/add different arrays.
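
If the array can't be changed, the host-side knob from the Evil Tuning Guide 
(the one Adam mentions below) is a single line in /etc/system, with the usual 
caveat that it's only safe when every pool on the host sits behind non-volatile 
cache:

set zfs:zfs_nocacheflush = 1

It takes a reboot and it applies to every pool on the box, which is why the 
array-side no-op is the nicer fix when you can get it.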

Adam N. Copeland wrote:
 Thanks for the replies.

 It appears the problem is that we are I/O bound. We have our SAN guy
 looking into possibly moving us to faster spindles. In the meantime, I
 wanted to implement whatever was possible to give us breathing room.
 Turning off atime certainly helped, but we are definitely not completely
 out of the drink yet.

 I also found that disabling the ZFS cache flush as per the Evil Tuning
 Guide was a huge boon, considering we're on a battery-backed (non-Sun) SAN.

 Thanks,
 Adam

 Richard Elling wrote:
   
 As it happens, I'm currently involved with a project doing some
 performance
 analysis for this... but it is currently a WIP.  Comments below.

 Robert Milkowski wrote:
 
 Hello Adam,

 Tuesday, October 21, 2008, 2:00:46 PM, you wrote:

 ANC We're using a rather large (3.8TB) ZFS volume for our mailstores
 on a
 ANC JMS setup. Does anybody have any tips for tuning ZFS for JMS? I'm
 ANC looking for even the most obvious tips, as I am a bit of a
 novice. Thanks,

 Well, it's kind of broad topic and it depends on a specific
 environment. Then do not tune for the sake of tuning - try to
 understand your problem first. Nevertheless you should consider
 things like (random order):

 1. RAID level - you probably will end-up with relatively small random
IOs - generally avoid RAID-Z
Of course it could be that RAID-Z in your environment is perfectly
fine.
   
   
 There are some write latency-sensitive areas that will begin
 to cause consternation for large loads.  Storage tuning is very
 important in this space.  In our case, we're using a ST6540
 array which has a decent write cache and fast back-end.

 
 2. Depending on your workload and disk subsystem ZFS's slog on SSD
 could help to improve performance
   
   
 My experiments show that this is not the main performance
 issue for large message volumes.

 
 3. Disable atime updates on zfs file system
   
   
 Agree.  JMS doesn't use it, so it just means extra work.

 
 4. Enabling compression like lzjb in theory could help - depends on
 how weel you data would compress and how much CPU you have left and if
 you are mostly IO bond
   
   
 We have not experimented with this yet, but know that some
 of the latency-sensitive writes are files with a small number of
 bytes, which will not compress to be less than one disk block.
 [opportunities for cleverness are here :-)]

 There may be a benefit for the message body, but in my tests
 we are not concentrating on that at this time.

 
 5. ZFS recordsize - probably not as in most cases when you read
 anything from email you will probably read entire mail anyway.
 Nevertheless could be easily checked with dtrace.
   
   
 This does not seem to be an issue.

 
 6. IIRC JMS keeps an index/db file per mailbox - so just maybe L2ARC
 on large SSD would help assuming it would nicely cache these files -
 would need to be simulated/tested
   
   
 This does not seem to be an issue, but in our testing the message
 stores have plenty of memory, and hence, ARC size is on the order
 of tens of GBytes.

 
 7. Disabling vdev pre-fetching in ZFS could help - see ZFS Evile tuning
 guide
   
   
 My experiments showed no benefit by disabling pre-fetch.  However,
 there are multiple layers of pre-fetching at play when you are using an
 array, and we haven't done a complete analysis on this yet.  It is clear
 that we are not bandwidth limited, so prefetching may not hurt.

 
 Except for #3 and maybe #7 first identify what is your problem and
 what are you trying to fix.

   
   
 Yep.
 -- richard

 
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Tuning ZFS for Sun Java Messaging Server

2008-10-24 Thread Torrey McMahon
Richard Elling wrote:
 Adam N. Copeland wrote:
   
 Thanks for the replies.

 It appears the problem is that we are I/O bound. We have our SAN guy
 looking into possibly moving us to faster spindles. In the meantime, I
 wanted to implement whatever was possible to give us breathing room.
 Turning off atime certainly helped, but we are definitely not completely
 out of the drink yet.

 I also found that disabling the ZFS cache flush as per the Evil Tuning
 Guide was a huge boon, considering we're on a battery-backed (non-Sun) SAN.
   
 

 Really?  Which OS version are you on?  This should have been
 fixed in Solaris 10 5/08 (it is a fix in the [s]sd driver).  Caveat: there
 may be some devices which do not properly negotiate the SYNC_NV
 bit.  In my tests, using Solaris 10 5/08, disabling the cache flush made
 zero difference.
   

PSARC 2007/053

If I read through the code correctly...

If the array doesn't respond to the device inquiry, you haven't made an 
entry in sd.conf for the array, and it isn't hard coded in the sd.c table 
- I think there are only two in that state - then you'd have to disable 
the cache flush.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] X4540

2008-07-10 Thread Torrey McMahon
Spencer Shepler wrote:
 On Jul 10, 2008, at 7:05 AM, Ross wrote:

   
 Oh god, I hope not.  A patent on fitting a card in a PCI-E slot, or  
 using nvram with RAID (which raid controllers have been doing for  
 years) would just be rediculous.  This is nothing more than cache,  
 and even with the American patent system I'd have though it hard to  
 get that past the obviousness test.
 

 How quickly they forget.

 Take a look at the Prestoserve User's Guide for a refresher...

 http://docs.sun.com/app/docs/doc/801-4896-11

Or Fast Write Cache

http://docs.sun.com/app/docs/coll/fast-write-cache2.0
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS Deferred Frees

2008-06-16 Thread Torrey McMahon
I'm doing some simple testing of ZFS block reuse and was wondering when 
deferred frees kick in. Is it on some sort of timer to ensure data 
consistency? Does another routine call it? Would something as simple as 
sync(1M) get the free block list written out so future allocations could 
use the space?

... or am I way off in the weeds? :)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS conflict with MAID?

2008-06-11 Thread Torrey McMahon
A Darren Dunham wrote:
 On Tue, Jun 10, 2008 at 05:32:21PM -0400, Torrey McMahon wrote:
   
 However, some apps will probably be very unhappy if i/o takes 60 seconds 
 to complete.
 

 It's certainly not uncommon for that to occur in an NFS environment.
 All of our applications seem to hang on just fine for minor planned and
 unplanned outages.

 Would the apps behave differently in this case?  (I'm certainly not
 thinking of a production database for such a configuration).

Some applications have their own internal timers that track i/o time 
and, if it doesn't complete in time, will error out. I don't know which 
part of the stack the timer was in but I've seen an Oracle RAC cluster 
on QFS time out much faster than the SCSI retries normally allow for. (I 
think it was Oracle in that case...)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS conflict with MAID?

2008-06-10 Thread Torrey McMahon
Richard Elling wrote:
 Tobias Exner wrote:
   
 Hi John,

 I've done some tests with a SUN X4500 with zfs and MAID using the 
 powerd of Solaris 10 to power down the disks which weren't access for 
 a configured time. It's working fine...

 The only thing I run into was the problem that it took roundabout a 
 minute to power on 4 disks in a zfs-pool. The problem seems to be that 
 the powerd starts the disks sequentially.
 

 Did you power down disks or spin down disks?  It is relatively
 easy to spin down (or up) disks with luxadm stop (start).  If a
 disk is accessed, then it will spin itself up.  By default, the timeout
 for disk response is 60 seconds, and most disks can spin up in
 less than 60 seconds.

However, some apps will probably be very unhappy if i/o takes 60 seconds 
to complete.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?

2008-05-19 Thread Torrey McMahon
The release should be out any day now. I think it's being pushed to the 
external download site whilst we type/read.

Andy Lubel wrote:
 The limitation existed in every Sun branded Engenio array we tested - 
 2510,2530,2540,6130,6540.  This limitation is on volumes.  You will not be 
 able to present a lun larger than that magical 1.998TB.  I think it is a 
 combination of both in CAM and the firmware.  Can't do it with sscs either...
  
 Warm and fuzzy:  Sun engineers told me they would have a new release of CAM 
 (and firmware bundle) in late June which would resolve this limitation.
  
 Or just do ZFS (or even SVM) setup like Bob and I did.  Its actually pretty 
 nice because the traffic will split to both controllers giving you 
 theoretically more throughput so long as MPxIO is functioning properly.  Only 
 (minor) downside is parity is being transmitted from the host to the disks 
 rather than living on the controller entirely.
  
 -Andy
  
 

 From: [EMAIL PROTECTED] on behalf of Torrey McMahon
 Sent: Mon 5/19/2008 1:59 PM
 To: Bob Friesenhahn
 Cc: zfs-discuss@opensolaris.org; Kenny
 Subject: Re: [zfs-discuss] ZFS and Sun Disk arrays - Opinions?



 Bob Friesenhahn wrote:
   
 On Mon, 19 May 2008, Kenny wrote:

  
 
 Bob M.- Thanks for the heads up on the 2 (1.998) TN Lun limit.
 This has me a little concerned esp. since I have 1 TB drives being
 delivered! Also thanks for the scsi cache flushing heads up, yet
 another item to lookup!  grin

   
 I am not sure if this LUN size limit really exists, or if it exists,
 in which cases it actually applies.  On my drive array, I created a
 3.6GB RAID-0 pool with all 12 drives included during the testing
 process.  Unfortunately, I don't recall if I created a LUN using all
 the space.

 I don't recall ever seeing mention of a 2TB limit in the CAM user
 interface or in the documentation.
 

 The Solaris LUN limit is gone if you're using Solaris 10 and recent patches.
 The array limit(s) are tied to the type of array you're using. (Which
 type is this again?)
 CAM shouldn't be enforcing any limits of its own but only reporting back
 when the array complains.
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Backup-ing up ZFS configurations

2008-03-21 Thread Torrey McMahon
eric kustarz wrote:
 So even with the above, if you add a vdev, slog, or l2arc later on,  
 that can be lost via the history being a ring buffer.  There's a RFE  
 for essentially taking your current 'zpool status' output and  
 outputting a config (one that could be used to create a brand new pool):
 6276640 zpool config

I'm surprised there haven't been more hands raised for this one. It 
would be very handy for a change management process, setting up DR 
sites, testing, etc.
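
Until something like that exists, the closest stopgap is to squirrel away the 
output you'd rebuild from. A sketch:

# zpool status -v tank   > /var/adm/tank.status
# zpool history -l tank  > /var/adm/tank.history
# zfs get -r all tank    > /var/adm/tank.properties

It isn't machine-replayable the way a generated config would be, but it 
survives the history ring buffer rolling over.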
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Round-robin NFS protocol with ZFS

2008-03-13 Thread Torrey McMahon
Tim wrote:




 He wants to mount the ZFS filesystem (I'm assuming off of a backend 
 SAN storage array) to two heads, then round-robin NFS connections 
 between the heads to essentially *double* the throughput.

pNFS is the droid you are looking for.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] SunMC module for ZFS

2008-02-15 Thread Torrey McMahon
Anyone have a pointer to a general ZFS health/monitoring module for 
SunMC? There isn't one baked into SunMC proper which means I get to 
write one myself if someone hasn't already done it.

Thanks.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Case #65841812

2008-02-02 Thread Torrey McMahon
I'm not an Oracle expert, but I don't think Oracle checksumming can 
correct data. If you have ZFS checksums enabled, and you're mirroring in 
your zpools, then ZFS can self-correct as long as the checksum on the other 
half of the mirror is good.
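
A scrub is the easy way to exercise that path - it reads every block, verifies 
the checksums, and repairs from the good side of the mirror (pool name 
hypothetical):

# zpool scrub tank
# zpool status -v tank      <- watch the CKSUM column and any repair counts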

Mertol Ozyoney wrote:
 Don't take my words as an expert advice, as I am newbie when it comes to
 ZFS. 

 If I am not mistaken, if you are only using Oracle on the particular Zpol,
 Oracle Checksum offers better protection against data corruption. 
 You can disable ZFS checksums. 

 Best regards
 Mertol


 Mertol Ozyoney 
 Storage Practice - Sales Manager

 Sun Microsystems, TR
 Istanbul TR
 Phone +902123352200
 Mobile +905339310752
 Fax +90212335
 Email [EMAIL PROTECTED]



 -Original Message-
 From: [EMAIL PROTECTED]
 [mailto:[EMAIL PROTECTED] On Behalf Of Scott Macdonald -
 Sun Microsystem
 Sent: Friday, 01 February 2008 15:31
 To: zfs-discuss@opensolaris.org; [EMAIL PROTECTED]
 Subject: [zfs-discuss] Case #65841812

 Below is my customers issue. I am stuck on this one. I would appreciate 
 if someone could help me out on this. Thanks in advance!



 ZFS Checksum feature:
  
 I/O checksum is one of the main ZFS features; however, there is also 
 block checksum done by Oracle. This is 
 good when utilizing UFS since it does not do checksums, but with ZFS it 
 can be a waste of CPU time.
 Suggestions have been made to change the Oracle db_block_checksum 
 parameter to false which may give 
 Significant performance gain on ZFS.
  
 What are Sun's stance and/or suggestions on making this change on the 
 ZFS side as well as making the changes on the Oracle side.

   


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Hardware RAID vs. ZFS RAID

2008-01-31 Thread Torrey McMahon
Kyle McDonald wrote:
 Vincent Fox wrote:
   
 So the point is,  a JBOD with a flash drive in one (or two to mirror the 
 ZIL) of the slots would be a lot SIMPLER.

 We've all spent the last decade or two offloading functions into specialized 
 hardware, that has turned into these massive unneccessarily complex things.

 I don't want to go to a new training class everytime we buy a new model of 
 storage unit.  I don't want to have to setup a new server on my private 
 network to run the Java GUI management software for that array and all the 
 other BS that array vendors put us through.

 I just want storage.
  
   
 
 Good Point.

You still need interfaces, of some kind, to manage the device. Temp 
sensors? Drive FRU information? All that information has to go out, and 
some in, over an interface of some sort.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS under VMware

2008-01-30 Thread Torrey McMahon
Lewis Thompson wrote:
 Hello,

 I'm planning to use VMware Server on Ubuntu to host multiple VMs, one
 of which will be a Solaris instance for the purposes of ZFS
 I would give the ZFS VM two physical disks for my zpool, e.g. /dev/sda
 and /dev/sdb, in addition to the VMware virtual disk for the Solaris
 OS

 Now I know that Solaris/ZFS likes to have total control over the disks
 to ensure writes are flushed as and when it is ready for them to
 happen, so I wonder if anybody comment on what implications using the
 disks in this way (i.e. through Linux and then VMware) has on the
 control Solaris has over these disks?  By using a VM will I be missing
 out in terms of reliability?  If so, can anybody suggest any
 improvements I could make while still allowing Solaris/ZFS to run in a
 VM?

I'm not sure what the perf aspects would be but it depends on what the 
VMware software passes through. Does it ignore cache sync commands in 
its i/o stack? Got me.

You won't be missing out on reliability but you will be introducing more 
layers in the stack where something could go wrong.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] NFS performance on ZFS vs UFS

2008-01-25 Thread Torrey McMahon
Robert Milkowski wrote:
 Hello Darren,



 DJM BTW there isn't really any such think as disk corruption there is 
 DJM data corruption :-)

 Well, if you scratch it hard enough :)
   

http://www.philohome.com/hammerhead/broken-disk.jpg :-)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] iscsi on zvol

2008-01-24 Thread Torrey McMahon
Jim Dunham wrote:

 This raises a key point that that you should be aware of. ZFS does not  
 support shared access to the same ZFS filesystem.


 unless you put NFS or something on top of it.

(I always forget that part myself.)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS via Virtualized Solaris?

2008-01-07 Thread Torrey McMahon
Peter Schuller wrote:
 From what I read, one of the main things about ZFS is Don't trust the
   
 underlying hardware.  If this is the case, could I run Solaris under
 VirtualBox or under some other emulated environment and still get the
 benefits of ZFS such as end to end data integrity?
 
 You could probably answer that question by changing the phrase to Don't
 trust the underlying virtual hardware!  ZFS doesn't care if the storage is
 virtualised or not.
 

 But worth noting is that, as with for example hardware RAID, if you intend to 
 take advantage of the self-healing properties of ZFS with multiple disks, you 
 must expose the individual disks to your mirror/raidz/raidz2 individually 
 through the virtualization environment and use them in your pool.

Or expose enough LUNs to take advantage of it. Two RAID LUNs in a mirror, 
for example.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Does Oracle support ZFS as a file system with Oracle RAC?

2007-12-23 Thread Torrey McMahon
Louwtjie Burger wrote:
 On 12/19/07, David Magda [EMAIL PROTECTED] wrote:
   
 On Dec 18, 2007, at 12:23, Mike Gerdts wrote:

 
 2) Database files - I'll lump redo logs, etc. in with this.  In Oracle
RAC these must live on a shared-rw (e.g. clustered VxFS, NFS) file
system.  ZFS does not do this.
   
 If you can use NFS, can't you put things on ZFS and then export?
 

 Is it a good idea to put a oracle database on the other end of a NFS
 mount ? (performance wise)

Depends on the characteristics of your network and what amount of 
performance you need. (As with most things it depends...)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] [storage-discuss] SAN arrays with NVRAM cache : ZIL and zfs_nocacheflush

2007-11-27 Thread Torrey McMahon
Nicolas Dorfsman wrote:
 On 27 Nov 07, at 16:17, Torrey McMahon wrote:
   
 According to the array vendor the 99xx arrays no-op the cache flush  
 command. No need to set the /etc/system flag.

 http://blogs.sun.com/torrey/entry/zfs_and_99xx_storage_arrays

 


 Perfect !

 Thanks Torrey.

   

Just realize that the HDS midrange arrays, which Sun does not resell, are 
different beasts than the 99xx line from Sun.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun's storage product roadmap?

2007-10-19 Thread Torrey McMahon
The profit stuff has been under NDA for a while, but we started telling the 
Street a while back and they seem to like the idea. :)

Selim Daoud wrote:
 wasn't that an NDA info??

 s-

 On 10/18/07, Torrey McMahon [EMAIL PROTECTED] wrote:
   
 MC wrote:
 
 Sun's storage strategy:

 1) Finish Indiana and distro constructor
 2) (ship stuff using ZFS-Indiana)
 3) Success
   
 4) Profit :)
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

 

   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Sun's storage product roadmap?

2007-10-18 Thread Torrey McMahon
MC wrote:
 Sun's storage strategy:

 1) Finish Indiana and distro constructor
 2) (ship stuff using ZFS-Indiana)
 3) Success

4) Profit :)
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] The ZFS-Man.

2007-09-21 Thread Torrey McMahon
Jonathan Edwards wrote:
 On Sep 21, 2007, at 14:57, eric kustarz wrote:

   
 Hi.

 I gave a talk about ZFS during EuroBSDCon 2007, and because it won  
 the
 the best talk award and some find it funny, here it is:

 http://youtube.com/watch?v=o3TGM0T1CvE

 a bit better version is here:

 http://people.freebsd.org/~pjd/misc/zfs/zfs-man.swf
   
 Looks like Jeff has been working out :)
 

 my first thought too:
 http://blogs.sun.com/bonwick/resource/images/bonwick.portrait.jpg

 funny - i always pictured this as UFS-man though:
 http://www.benbakerphoto.com/business/47573_8C-after.jpg

 but what's going on with the sheep there?

Got me, but they do look kind of nervous. (Happy Friday, folks...)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Solaris 10 Update 4 Patches

2007-09-20 Thread Torrey McMahon
Did you upgrade your pools? zpool upgrade -a

John-Paul Drawneek wrote:
 err, I installed the patch and am still on zfs 3?

 solaris 10 u3 with kernel patch 120011-14
  
  
 This message posted from opensolaris.org
 ___
 zfs-discuss mailing list
 zfs-discuss@opensolaris.org
 http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

   

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Mirrored zpool across network

2007-08-19 Thread Torrey McMahon
Mark wrote:
 Hi All,

 Im just wondering (i figure you can do this but dont know what hardware and 
 stuff i would need) if I can set up a mirror of a raidz zpool across a 
 network.

 Basically, the setup is a large volume of Hi-Def video is being streamed from 
 a camera, onto an editing timeline. This will be written to a network share. 
 Due to the large amounts of data, ZFS is a really good option for us. But we 
 need a backup. We need to do it on generic hardware (i was thinking AMD64 
 with an array of large 7200rpm hard drives), and therefore i think im going 
 to have one box mirroring the other box. They will be connected by gigabit 
 ethernet. So my question is how do I mirror one raidz Array across the 
 network to the other?

rsync?
zfs send/rcv?
AVS?
iSCSI targets on the two boxes?

Lots of ways to do it. Depends what your definition of backup is. Time 
based? Extra redundancy?
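
A minimal zfs send/recv sketch (pool and host names are made up for 
illustration):

# zfs snapshot tank/video@monday
# zfs send tank/video@monday | ssh backupbox zfs recv backup/video

and from then on you only ship the deltas, e.g. zfs send -i monday 
tank/video@tuesday piped to the same place.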

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Snapshots and worm devices

2007-08-14 Thread Torrey McMahon
Has anyone thought about using snapshots and WORM devices? In theory, 
you'd have to keep the WORM drive out of the pool, or as a special 
device, and it would have to be a full snapshot even though we really 
don't have those.

Any plans in this area? I could take a snapshot, clone it, then copy it 
to the worm device with cpio or friends but that adds time and 
possibility of error(s).

Thanks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and powerpath

2007-07-16 Thread Torrey McMahon
Carisdad wrote:
 Peter Tribble wrote:
   
 # powermt display dev=all
 Pseudo name=emcpower0a
 CLARiiON ID=APM00043600837 []
 Logical device ID=600601600C4912003AB4B247BA2BDA11 [LUN 46]
 state=alive; policy=CLAROpt; priority=0; queued-IOs=0
 Owner: default=SP B, current=SP B
 ==
  Host ---   - Stor -   -- I/O Path -  -- Stats 
 ---
 ###  HW PathI/O PathsInterf.   ModeState  Q-IOs 
 Errors
 ==
 3073 [EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 
 c2t500601613060099Cd1s0 SP A1
 active  alive  0  0
 3073 [EMAIL PROTECTED],60/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 
 c2t500601693060099Cd1s0 SP B1
 active  alive  0  0
 3072 [EMAIL PROTECTED],70/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 
 c3t500601603060099Cd1s0 SP A0
 active  alive  0  0
 3072 [EMAIL PROTECTED],70/[EMAIL PROTECTED]/[EMAIL PROTECTED],0 
 c3t500601683060099Cd1s0 SP B0
 active  alive  0  0
   

 
 If it helps at all.  We're having a similar problem.  Any LUN's 
 configured with their default owner to be SP B, don't get along with 
 ZFS.   We're running on a T2000, With Emulex cards and the ssd driver.  
 MPXIO seems to work well for most cases, but the SAN guys are not 
 comfortable with it.

Are you using the top level powerpath device? Is the clariion in an 
auto-trespass mode where any i/o going down the alt path will cause the 
LUNs to move?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and powerpath

2007-07-16 Thread Torrey McMahon
Darren Dunham wrote:
 If it helps at all.  We're having a similar problem.  Any LUN's 
 configured with their default owner to be SP B, don't get along with 
 ZFS.   We're running on a T2000, With Emulex cards and the ssd driver.  
 MPXIO seems to work well for most cases, but the SAN guys are not 
 comfortable with it.
   
 Are you using the top level powerpath device? Is the clariion in an 
 auto-trespass mode where any i/o going down the alt path will cause the 
 LUNs to move?
 

 My previous experience with powerpath was that it rode below the Solaris
 device layer.  So you couldn't cause trespass by using the wrong
 device.  It would just go to powerpath which would choose the link to
 use on its own.

 Is this not true or has it changed over time?
   

I haven't looked at power path for some time but it used to be the 
opposite. The powerpath node sat on top of the actual device paths. One 
of the selling points of mpxio is that it doesn't have that problem. (At 
least for devices it supports.) Most of the multipath software had that 
same limitation

However, I'm not an expert on powerpath by any stretch of the 
imagination. I just took a quick look at the powerpath manual (4.0 
version) and it says you can now use both types, which seems a little 
confusing. Again, I'd be interested to see if using the pseudo-device 
works better ... not to mention how it works using the direct path disk 
entry.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how to remove sun volume mgr configuration?

2007-07-16 Thread Torrey McMahon
Bill Sommerfeld wrote:
 On Mon, 2007-07-16 at 18:19 -0700, Russ Petruzzelli wrote:
   
 Or am I just getting myself into shark infested waters?
 

 configurations that might be interesting to play with:
 (emphasis here on play...)

  1) use the T3's management CLI to reconfigure the T3 into two raid-0
 volumes, and mirror them with ZFS.  

  2) if you have some JBODs available as well, use the T3 (which has a
 modest-sized battery backed write cache in the controller) as a separate
 log device (that's a new feature introduced in a recent nevada build).

Has the project that lets you specify an array as having a battery backup 
gone in yet? If not, then wouldn't the sync cache problem be in play? 
I don't know if the T3 honors cache flush commands or sets the I've got 
a stable cache bit in the relevant SCSI mode page.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] how to remove sun volume mgr configuration?

2007-07-16 Thread Torrey McMahon
James C. McPherson wrote:


 The T3B with fw v3.x (I think) and the T4 (aka 6020 tray) allow
 more than two volumes, but you're still quite restricted in what
 you can do with them.
   

You are limited to two raid groups, with slices on top of those raid 
groups presented as LUNs. I'd just stick with the raid groups and not 
go overboard with slices, because you can just ...

   
 [SNIP]

   
 You can use ZFS on that volume, but it will have no redundancy at the
 ZFS level, only at the disk level controlled by the T3.
 

 Well ... you could create two volumes on the array and mirror
 those using ZFS  Some might say that's a waste of space :)


... stick to R0 and then mirror with ZFS? At least T3 will let you do 
that as opposed to other storage arrays that let you pick from R1 and R5 
only.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and powerpath

2007-07-13 Thread Torrey McMahon
Peter Tribble wrote:
 On 7/13/07, Alderman, Sean [EMAIL PROTECTED] wrote:
   
 I wonder what kind of card Peter's using and if there is a potential
 linkage there.  We've got the Sun branded Emulux cards in our sparcs.  I
 also wonder if Peter were able to allocate an additional LUN to his
 system whether or not he'd be able to create a pool on that new LUN.
 

 On a different continent and I didn't buy it. Shows up as lpfc (is
 that Emulex?). I'm not sure that's related - I can see the LUNs
 and devices, it's just that zfs isn't happy.

Those, lpfc, are native Emulex drivers.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and powerpath

2007-07-13 Thread Torrey McMahon
[EMAIL PROTECTED] wrote:



 [EMAIL PROTECTED] wrote on 07/13/2007 02:21:52 PM:

   
 Peter Tribble wrote:

 
 I've not got that far. During an import, ZFS just pokes around - there
 doesn't seem to be an explicit way to tell it which particular devices
 or SAN paths to use.
   
 You can't tell it which devices to use in a straightforward manner. But
 you can tell it which directories to scan.

 zpool import [-d dir]

 By default, it scans /dev/dsk.

 Does truss of zfs import show the powerrpath devices being opened and
 read from?
 


 AFAIK powerpath does not really need to use the powerpath pseudo devices --
 they are just there for convenience.  I would expect the drives to be
 readable from either the c1 devices or emc*.

ZFS needs to use the top level multipath device or bad things will 
probably happen in a failover or in initial zpool creation. For 
example: you'll try to use the device on two paths and cause a LUN 
failover to occur.

Mpxio fixes a lot of these issues. I strongly suggest using mpxio 
instead of powerpath, but sometimes powerpath is all you can use if the array is 
new and mpxio doesn't have the hooks for it ... yet.
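
In practice that means handing ZFS only the pseudo-device, along the 
lines of (the device name is borrowed from earlier powermt output, so 
treat it as illustrative):

# zpool create tank emcpower0a
# zpool status tank

and never mixing the underlying per-path c2/c3 devices into the same pool.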

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Plans for swapping to part of a pool

2007-07-12 Thread Torrey McMahon
I really don't want to bring this up but ...

Why do we still tell people to use swap volumes? Would we have the same 
sort of issue with the dump device so we need to fix it anyway?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] How to take advantage of PSARC 2007/171: ZFS Separate Intent Log

2007-07-08 Thread Torrey McMahon
Bryan Cantrill wrote:
 On Tue, Jul 03, 2007 at 10:26:20AM -0500, Albert Chin wrote:
   
 PSARC 2007/171 will be available in b68. Any documentation anywhere on
 how to take advantage of it?

 Some of the Sun storage arrays contain NVRAM. It would be really nice
 if the array NVRAM would be available for ZIL storage. 
 

 It depends on your array, of course, but in most arrays you can control
 the amount of write cache (i.e., NVRAM) dedicated to particular LUNs.
 So to use the new separate logging most effectively, you should take
 your array, and dedicate all of your NVRAM to a single LUN that you then
 use as your separate log device.  Your pool should then use a LUN or LUNs
 that do not have any NVRAM dedicated to it.  

On some of the new Sun midrange arrays you can disable cache to a LUN 
but I've never seen hooks that let you dedicate a certain amount of 
cache to one LUN in particular. (None of the older midrange arrays let 
you do this.) Some of the high end arrays allow you to pin some data in 
cache like the 9990.
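
With the build 68 bits the separate log ends up looking roughly like this 
(a sketch; the device names are made up, and the log LUN would be the one 
backed by the array NVRAM):

# zpool create tank c1t0d0 c1t1d0 log c2t0d0
# zpool add tank log c2t0d0     (for an existing pool)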

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs space efficiency

2007-06-24 Thread Torrey McMahon
The interesting collision is going to be file system level encryption 
vs. de-duplication as the former makes the latter pretty difficult.


dave johnson wrote:
How other storage systems do it is by calculating a hash value for 
said file (or block), storing that value in a db, then checking every 
new file (or block) commit against the db for a match and if found, 
replace file (or block) with duplicate entry in db.


The most common non-proprietary hash calc for file-level deduplication 
seems to be the combination of SHA1 and MD5 together.  Collisions 
have been shown to exist in MD5 and theorized to exist in SHA1 by 
extrapolation, but the probability of collisions occurring 
simultaneously in both is as small as the capacity of ZFS is large :)


While computationally intense, this would be a VERY welcome feature 
addition to ZFS, and given the infrastructure already within the 
filesystem it seems a prime candidate, even though it is non-trivial 
by any means.  I am not a programmer so I do not have the expertise to 
spearhead such a movement, but I would think getting at least a 
placeholder Goals and Objectives page into the OZFS community pages 
would be a good start even if movement on this doesn't come for a year 
or more.


Thoughts ?

-=dave

- Original Message - From: Gary Mills [EMAIL PROTECTED]
To: Erik Trimble [EMAIL PROTECTED]
Cc: Matthew Ahrens [EMAIL PROTECTED]; roland 
[EMAIL PROTECTED]; zfs-discuss@opensolaris.org

Sent: Sunday, June 24, 2007 3:58 PM
Subject: Re: [zfs-discuss] zfs space efficiency



On Sun, Jun 24, 2007 at 03:39:40PM -0700, Erik Trimble wrote:

Matthew Ahrens wrote:
Will Murnane wrote:
On 6/23/07, Erik Trimble [EMAIL PROTECTED] wrote:
Now, wouldn't it be nice to have syscalls which would implement cp
and
mv, thus abstracting it away from the userland app?



A copyfile primitive would be great!  It would solve the problem of
having all those friends to deal with -- stat(), extended
attributes, UFS ACLs, NFSv4 ACLs, CIFS attributes, etc.  That isn't to
say that it would have to be implemented in the kernel; it could
easily be a library function.

I'm with Matt.  Having a copyfile library/sys call would be of
significant advantage.  In this case, we can't currently take advantage
of the CoW ability of ZFS when doing 'cp A B'  (as has been pointed out
to me).  'cp' simply opens file A with read(), opens a new file B with
write(), and then shuffles the data between the two.  Now, if we had a
copyfile(A,B) primitive, then the 'cp' binary would simply call this
function, and, depending on the underlying FS, it would get implemented
differently.  In UFS, it would work as it does now. For ZFS, it would
work like a snapshot, where file A and B share data blocks (at least
until someone starts to update either A or B).


Isn't this technique an instance of `deduplication', which seems to be
a hot idea in storage these days?  I wonder if it could be done
automatically, behind the scenes, in some fashion.

--
-Gary Mills--Unix Support--U of M Academic Computing and 
Networking-

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-24 Thread Torrey McMahon

Gary Mills wrote:

On Wed, Jun 20, 2007 at 12:23:18PM -0400, Torrey McMahon wrote:
  

James C. McPherson wrote:


Roshan Perera wrote:
  

But Roshan, if your pool is not replicated from ZFS' point of view,
then all the multipathing and raid controller backup in the world will
not make a difference.
  

James, I agree from the ZFS point of view. However, from the EMC or the
customer point of view they want to do the replication at the EMC level
and not from ZFS. By replicating at the ZFS level they will lose some
storage and it's doubling the replication. It's just that the customer is used to
working with Veritas and UFS and they don't want to change their 
habits.

I just have to convince the customer to use ZFS replication.


that's a great shame because if they actually want
to make use of the features of ZFS such as replication,
then they need to be serious about configuring their
storage to play in the ZFS world and that means
replication that ZFS knows about.
  
Also, how does replication at the ZFS level use more storage - I'm 
assuming raw block - then at the array level?



SAN storage generally doesn't work that way.  They use some magical
redundancy scheme, which may be RAID-5 or WAFL, from which the Storage
Administrator carves out virtual disks.  These are best viewed as an
array of blocks.  All disk administration, such as replacing failed
disks, takes place on the storage device without affecting the virtual
disks.  There's no need for disk administration or additional
redundancy on the client side.  If more space is needed on the client,
the Storage Administrator simply expands the virtual disk by extending
its blocks.  ZFS needs to play nicely in this environment because
that's what's available in large organizations that have centralized
their storage.  Asking for raw disks doesn't work.
  


Are we talking about replication - I have a copy of my data on another 
system - or redundancy - I have a system where I can tolerate a local 
failure?


...and I understand the ZFS has to play nice with HW raid argument. :)


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-24 Thread Torrey McMahon

Victor Engle wrote:

On 6/20/07, Torrey McMahon [EMAIL PROTECTED] wrote:
Also, how does replication at the ZFS level use more storage - I'm
assuming raw block - then at the array level?
___



Just to add to the previous comments. In the case where you have a SAN
array providing storage to a host for use with ZFS the SAN storage
really needs to be redundant in the array AND the zpools need to be
redundant pools.

The reason the SAN storage should be redundant is that SAN arrays are
designed to serve logical units. The logical units are usually
allocated from a raid set, storage pool or aggregate of some kind. The
array side pool/aggregate may include 10 300GB disks and may have 100+
luns allocated from it for example. If redundancy is not used in the
array side pool/aggregate and then 1 disk failure will kill 100+ luns
at once. 


That makes a lot of sense in configurations where an array is exporting 
LUNs built on raid volumes to a set of heterogeneous hosts. If you're 
directly connected to a single box running ZFS or a set of boxes running 
ZFS you probably want to export something as close to the raw disks as 
possible while maintaining ZFS level redundancy. (Like two R5 LUNs in a 
ZFS mirror.) Creating a raid set, carving out lots of LUNs and then 
handing them all over to ZFS isn't going to buy you a lot and could 
cause performance issues. (LUN skew for example.)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - SAN and Raid

2007-06-20 Thread Torrey McMahon

James C. McPherson wrote:

Roshan Perera wrote:



But Roshan, if your pool is not replicated from ZFS' point of view,
then all the multipathing and raid controller backup in the world will
not make a difference.


James, I agree from the ZFS point of view. However, from the EMC or the
customer point of view they want to do the replication at the EMC level
and not from ZFS. By replicating at the ZFS level they will lose some
storage and it's doubling the replication. It's just that the customer is used to
working with Veritas and UFS and they don't want to change their 
habits.

I just have to convince the customer to use ZFS replication.


Hi Roshan,
that's a great shame because if they actually want
to make use of the features of ZFS such as replication,
then they need to be serious about configuring their
storage to play in the ZFS world and that means
replication that ZFS knows about.



Also, how does replication at the ZFS level use more storage - I'm 
assuming raw block - then at the array level?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs and EMC

2007-06-15 Thread Torrey McMahon
This sounds familiar ... like something about the powerpath device not 
responding to the SCSI inquiry strings. Are you using the same version 
of powerpath on both systems? Same type of array on both?


Dominik Saar wrote:

Hi there,

have a strange behavior when I create a zfs pool on an EMC PowerPath
pseudo device.

I can create a pool on emcpower0a
but not on emcpower2a

zpool core dumps with invalid argument  

That's my second machine with powerpath and zfs
the first one works fine, even zfs/powerpath and failover ...

Is there anybody who has the same failure and a solution ? :)

Greets

Dominik



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

  



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] IRC: thought: irc.freenode.net #zfs for platform-agnostic or multi-platform discussion

2007-06-08 Thread Torrey McMahon

Graham Perrin wrote:

We have irc://irc.freenode.net/solaris
and irc://irc.freenode.net/opensolaris
and the other channels listed at
http://blogs.sun.com/jimgris/entry/opensolaris_on_irc

AND growing discussion of ZFS in Mac- 'FUSE- and Linux-oriented channels

BUT unless I'm missing something, no IRC channel for ZFS.

Please:

* which IRC channel will be best for discussion of ZFS from a
  multi-platform or platform-agnostic viewpoint? 


#zfs though it looks like there aren't many people on at the 
moment ... and maybe someone had the same idea I did and just opened it.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-25 Thread Torrey McMahon

Toby Thain wrote:


On 25-May-07, at 1:22 AM, Torrey McMahon wrote:


Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is 
your service subscription? Maybe only scrapyards and museums will 
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees 
that RAID subsystems cannot.


Depend on the guarantees. Some RAID systems have built in block 
checksumming.




Which still isn't the same. Sigh. 


Yep ... you get what you pay for. Funny how ZFS is free to purchase, 
isn't it?


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Torrey McMahon

Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is your 
service subscription? Maybe only scrapyards and museums will have 
what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees that 
RAID subsystems cannot. 


Depend on the guarantees. Some RAID systems have built in block 
checksumming.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] No zfs_nocacheflush in Solaris 10?

2007-05-24 Thread Torrey McMahon

Albert Chin wrote:

On Thu, May 24, 2007 at 11:55:58AM -0700, Grant Kelly wrote:
  


I'm getting really poor write performance with ZFS on a RAID5 volume
(5 disks) from a storagetek 6140 array. I've searched the web and
these forums and it seems that this zfs_nocacheflush option is the
solution, but I'm open to others as well.



What type of poor performance? Is it because of ZFS? You can test this
by creating a RAID-5 volume on the 6140, creating a UFS file system on
it, and then comparing performance with what you get against ZFS.
  


If it's ZFS then you might want to check into modifying the 6540 NVRAM 
as mentioned in this thread


http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html

There is a fix that doesn't involve modifying the NVRAM in the works. (I 
don't have an estimate.)


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS - Use h/w raid or not? Thoughts. Considerations.

2007-05-24 Thread Torrey McMahon
I did say depends on the guarantees, right?  :-)  My point is that all 
hw raid systems are not created equally.


Nathan Kroenert wrote:
Which has little benefit if it's the HBA or the array internals that change 
the meaning of the message...


That's the whole point of ZFS's checksumming - It's end to end...

Nathan.

Torrey McMahon wrote:

Toby Thain wrote:


On 22-May-07, at 11:01 AM, Louwtjie Burger wrote:


On 5/22/07, Pål Baltzersen [EMAIL PROTECTED] wrote:

What if your HW-RAID-controller dies? in say 2 years or more..
What will read your disks as a configured RAID? Do you know how to 
(re)configure the controller or restore the config without 
destroying your data? Do you know for sure that a spare-part and 
firmware will be identical, or at least compatible? How good is 
your service subscription? Maybe only scrapyards and museums will 
have what you had. =o


Be careful when talking about RAID controllers in general. They are
not created equal! ...
Hardware raid controllers have done the job for many years ...


Not quite the same job as ZFS, which offers integrity guarantees 
that RAID subsystems cannot. 


Depend on the guarantees. Some RAID systems have built in block 
checksumming.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: AVS replication vs ZFS send recieve for odd sized volume pairs

2007-05-19 Thread Torrey McMahon

John-Paul Drawneek wrote:

Yes, I am also interested in this.

We can't afford two super fast setups so we are looking at having a huge pile 
of SATA to act as a real time backup for all our streams.

So what can AVS do, and what are its limitations?

Would just using zfs send and receive do, or does AVS make it all seamless?
  


Checkout http://www.opensolaris.org/os/project/avs/Demos/
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Re: Lots of overhead with ZFS - what am I doing wrong?

2007-05-19 Thread Torrey McMahon

Jonathan Edwards wrote:


On May 15, 2007, at 13:13, Jürgen Keil wrote:


Would you mind also doing:

ptime dd if=/dev/dsk/c2t1d0 of=/dev/null bs=128k count=1

to see the raw performance of underlying hardware.


This dd command is reading from the block device,
which might cache data ... and probably splits requests
into maxphys pieces (which happens to be 56K on an
x86 box).


to increase this to say 8MB, add the following to /etc/system:

set maxphys=0x800000

and you'll probably want to increase sd_max_xfer_size as
well (should be 256K on x86/x64) .. add the following to
/kernel/drv/sd.conf:

sd_max_xfer_size=0x800000;

then reboot to get the kernel and sd tunings to take.

---
.je

btw - the defaults on sparc:
maxphys = 128K
ssd_max_xfer_size = maxphys
sd_max_xfer_size = maxphys


Maybe we should file a bug to increase the max transfer request sizes?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS Support for remote mirroring

2007-05-09 Thread Torrey McMahon

Anantha N. Srirama wrote:

For whatever reason EMC notes (on PowerLink) suggest that ZFS is not supported 
on their arrays. If one is going to use a ZFS filesystem on top of an EMC array, 
be warned about support issues.


They should have fixed that in their matrices. It should say something 
like, EMC supports serving LUNs to ZFS.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Support for remote mirroring

2007-05-07 Thread Torrey McMahon

Matthew Ahrens wrote:

Aaron Newcomb wrote:

Does ZFS support any type of remote mirroring? It seems at present my
only two options to achieve this would be Sun Cluster or Availability
Suite. I thought that this functionality was in the works, but I haven't
heard anything lately.


You could put something together using iSCSI, or zfs send/recv.


I think the definition of remote mirror is up for grabs here but in my 
mind remote mirror means the remote node has an always up-to-date copy of 
the primary data set modulo any transactions in flight. AVS, aka remote 
mirror, aka sndr, is usually used for this kind of work on the host. 
Storage arrays have things like, ahem, remote mirror, truecopy, srdf, etc.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Support for remote mirroring

2007-05-02 Thread Torrey McMahon

Aaron Newcomb wrote:

Does ZFS support any type of remote mirroring? It seems at present my only two 
options to achieve this would be Sun Cluster or Availability Suite. I thought 
that this functionality was in the works, but I haven't heard anything lately.
  


AVS is working today. (See Jim Dunham's frequent posts.) Are you looking 
for something tied directly into ZFS or ???

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS Support for remote mirroring

2007-05-02 Thread Torrey McMahon

Aaron Newcomb wrote:

Terry,

Yes. AVS is pretty expensive. If ZFS did this out of the box it would be a huge 
differentiator. I know ZFS does snapshots today, but if we could extend this 
functionality to work across distance then we would have something that could 
compete with expensive solutions from EMC, HP, IBM, NetApp, etc. And to do it 
with open source software ... even better.


AVS is already open-sourced. Not sure as to the free part but given the 
code is out there ...


http://www.opensolaris.org/os/project/avs/
http://www.opensolaris.org/os/project/avs/files/ for the files.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs boot image conversion kit is posted

2007-05-01 Thread Torrey McMahon

Brian Hechinger wrote:

On Fri, Apr 27, 2007 at 02:44:02PM -0700, Malachi de Ælfweald wrote:
  


2. ZFS mirroring can work without the metadb, but if you want the dump
mirrored too, you need the metadb (I don't know if it needs to be mirrored,
but I wanted both disks to be identical in case one died)



I can't think of any real good reason you would need a mirrored dump device.
The only place that would help you is if your main disk died between panic
and next boot.  ;)
  


If you lose the primary drive, and your dump device points to the 
metadevice, then you wouldn't have to reset it. Also, most folks use the 
swap device for dumps. You wouldn't want to lose that on a live box. 
(Though honestly I've never just yanked the swap device and seen if the 
system keels over.)
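
Something like the following is the idea; a sketch, where d20 is a 
made-up SVM mirror metadevice:

# dumpadm -d /dev/md/dsk/d20

so the dump target survives either half of the mirror going away.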



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Boot: Dividing up the name space

2007-05-01 Thread Torrey McMahon

Mike Dotson wrote:

On Sat, 2007-04-28 at 17:48 +0100, Peter Tribble wrote:
  

On 4/26/07, Lori Alt [EMAIL PROTECTED] wrote:


Peter Tribble wrote:
  

snip



  

Why do administrators do 'df' commands?  It's to find out how much space


is used or available in a single file system.   That made sense when file
systems each had their own dedicated slice, but now it doesn't make that
much sense anymore.  Unless you've assigned a quota to a zfs file system,
space available is meaningful more at the pool level.
  

True, but it's actually quite hard to get at the moment. It's easy if
you have a single pool - it doesn't matter which line you look at.
But once you have 2 or more pools (and that's the way it would
work, I expect - a boot pool and 1 or more data pools) there's
an awful lot of output you may have to read. This isn't helped
by zpool and zfs giving different answers, with the one from zfs
being the one I want. The point is that every filesystem adds
additional output the administrator has to mentally filter. (For
one thing, you have to map a directory name to a containing
pool.)



It's actually quite easy and easier than the other alternatives (ufs,
veritas, etc):

# zfs list -rH -o name,used,available,refer rootdg

And now it's setup to be parsed by a script (-H) since the output is
tabbed.  The -r says to recursively display children of the parent and
the -o with the specified fields says to only display the fields
specified.

(output from one of my systems)

blast(9): zfs list -rH -o name,used,available,refer rootdg
rootdg  4.39G   44.1G   32K
rootdg/nvx_wos_62   4.38G   44.1G   503M
rootdg/nvx_wos_62/opt   793M44.1G   793M
rootdg/nvx_wos_62/usr   3.01G   44.1G   3.01G
rootdg/nvx_wos_62/var   113M44.1G   113M
rootdg/swapvol  16K 44.1G   16K

Even tho the mount point is setup as a legacy mount point, I know where
each of them is mounted due to the vol name.


And yes, this system has more than one pool:

blast(10): zpool list
NAMESIZEUSED   AVAILCAP  HEALTH ALTROOT
lpool  17.8G   11.4G   6.32G64%  ONLINE -
rootdg 49.2G   4.39G   44.9G 8%  ONLINE -


  

With zfs, file systems are in many ways more like directories than what
we used to call file systems.   They draw from pooled storage.  They
have low overhead and are easy to create and destroy.  File systems
are sort of like super-functional directories, with quality-of-service
control and cloning and snapshots.  Many of the things that sysadmins
used to have to do with file systems just aren't necessary or even
meaningful anymore.  And so maybe the additional work of managing
more file systems is actually a lot smaller than you might initially think.
  

Oh, I agree. The trouble is that sysadmins still have to work using
their traditional tools, including their brains, which are tooled up
for cases with a much lower filesystem count. What I don't see as
part of this are new tools (or enhancements to existing tools) that
make this easier to handle.



Not sure I agree with this.  Many times, you end up dealing with
multiple vxvol's and file systems.  Anything over 12 filesystems and
you're in overload (at least for me;) and I used my monitoring and
scripting tools to filter that for me. 


Many of the systems I admin'd were setup quite differently based on use
and functionality and disk size.

Most of my tools were setup to take most of these into consideration and
the fact that we ran almost every flavor of UNIX possible using the
features of each OS as appropriate.

Most of the tools will still work with zfs (if using df, etc) but it
actually makes it easier once you have a monitoring issue - running out
of space for example.

Most tools have high and low water marks so when a file system gets too
full, you get a warning.  ZFS makes this much easier to admin as you can
see which file system is being the hog and go directly to that file
system and hunt instead of first finding the file system, hence the
debate of the all-in-one / slice or breaking up to the major os fs's.

Benefit of all-in-one / is you didn't have to guess at how much space
you needed for each slice so you could upgrade, add optional software
without needing to grow/shrink the OS.

Drawback, if you filled up the file system, you had to hunt where it was
filling up - /dev, /usr, /var/tmp, /var, / ??? 


Benefit of multiple slices was one fs didn't affect the others if you
filled it up and you could find which was the problem fs very easily but
if you estimated incorrectly, you had wasted disk space in one slice and
not enough in another.

ZFS gives you the benefit of both all-in-one and partitioned as it draws
from a single pool of storage but also allows you to find which fs is
being the problem and lock it down with quotas and reservations.
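
(A quick illustration using the dataset names from the listing above; 
the sizes are made up:

# zfs set quota=10G rootdg/nvx_wos_62/var
# zfs set reservation=2G rootdg/nvx_wos_62/var

caps /var at 10G and still guarantees it 2G out of the pool.)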

  

For example, backup tools are currently filesystem based.



And this changes the scenario how?  

Re: [zfs-discuss] slow sync on zfs

2007-04-23 Thread Torrey McMahon

Dickon Hood wrote:

[snip]

I'm currently playing with ZFS on a T2000 with 24x500GB SATA discs in an
external array that presents as SCSI.  After having much 'fun' with the
Solaris SCSI driver not handling LUNs 2TB


That should work if you have the latest KJP and friends. (Actually, it 
should have been working for a while, so if not ...) What release are you on?

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS+NFS on storedge 6120 (sun t4)

2007-04-20 Thread Torrey McMahon

Marion Hakanson wrote:

[EMAIL PROTECTED] said:
  

We have been combing the message boards and it looks like there was a lot of
talk about this interaction of zfs+nfs back in november and before but since
i have not seen much.  It seems the only fix up to that date was to disable
zil, is that still the case?  Did anyone ever get closure on this? 



There's a way to tell your 6120 to ignore ZFS cache flushes, until ZFS
learns to do that itself.  See:
  http://mail.opensolaris.org/pipermail/zfs-discuss/2006-December/024194.html

  


The 6120 isn't the same as a 6130/6140/6540. The instructions 
referenced above won't work on a T3/T3+/6120/6320.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Testing of UFS, VxFS and ZFS

2007-04-17 Thread Torrey McMahon

Anton B. Rang wrote:

Second, VDBench is great for testing raw block i/o devices.
I think a tool that does file system testing will get you
better data.



OTOH, shouldn't a tool that measures raw device performance reasonably reflect 
Oracle performance when configured for raw devices? I don't know the current best 
practice for Oracle, but a lot of DBAs still use raw devices instead of files for 
their table spaces
  


Sure, once you characterize what the performance of the Oracle DB is 
(read% vs. write%, i/o size, etc.), VDBench is great for testing the raw 
device with whatever workload you want to test.


Most of the Oracle folks I talk to mention they use fs these days ... 
but that isn't scientific by any stretch.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] snapshot features

2007-04-16 Thread Torrey McMahon

Frank Cusack wrote:
On April 16, 2007 10:24:04 AM +0200 Selim Daoud 
[EMAIL PROTECTED] wrote:

hi all ,

when doing several zfs snapshots of a given fs, there are dependencies
between snapshots that complicate the management of snapshots.
Is there a plan to ease these dependencies, so we can reach the snapshot
functionality that is offered in other products such as Compellent
(http://www.compellent.com/products/software/continuous_snapshots.aspx)?

Compellent software allows you to set **retention periods** for different
snapshots and will manage their migration or deletion automatically.


retention period is pretty easily managed via cron 


Yeah but cron isn't easily managed by anything. :-P
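
Something this small gets you a rolling retention window from cron, 
though; a sketch, with a made-up filesystem name and a 24-hour window:

#!/bin/sh
# rotate-snap.sh - keep the last 24 hourly snapshots of tank/data
FS=tank/data
NOW=hourly-`date +%H`
zfs destroy $FS@$NOW > /dev/null 2>&1
zfs snapshot $FS@$NOW

driven by a crontab entry like 0 * * * * /usr/local/bin/rotate-snap.sh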
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Testing of UFS, VxFS and ZFS

2007-04-16 Thread Torrey McMahon

Tony Galway wrote:


I had previously undertaken a benchmark that pits “out of box” 
performance of UFS via SVM, VxFS and ZFS but was waylaid due to some 
outstanding availability issues in ZFS. These have been taken care of, 
and I am once again undertaking this challenge on behalf of my 
customer. The idea behind this benchmark is to show


a. How ZFS might displace the current commercial volume and file 
system management applications being used.


b. The learning curve of moving from current volume management 
products to ZFS.


c. Performance differences across the different volume management 
products.


VDBench is the test bed of choice as this has been accepted by the 
customer as a telling and accurate indicator of performance. The last 
time I attempted this test it had been suggested that VDBench is not 
appropriate to testing ZFS, I cannot see that being a problem, VDBench 
is a tool – if it highlights performance problems, then I would think 
it is a very effective tool so that we might better be able to fix 
those deficiencies.




First, VDBench is a Sun internal and partner only tool so you might not 
get much response on this list.
Second, VDBench is great for testing raw block i/o devices. I think a 
tool that does file system testing will get you better data.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Poor man's backup by attaching/detaching mirror drives on a _striped_ pool?

2007-04-11 Thread Torrey McMahon

Frank Cusack wrote:
On April 11, 2007 11:54:38 AM +0200 Constantin Gonzalez Schmitz 
[EMAIL PROTECTED] wrote:

Hi Mark,

Mark J Musante wrote:

On Tue, 10 Apr 2007, Constantin Gonzalez wrote:


Has anybody tried it yet with a striped mirror? What if the pool is
composed out of two mirrors? Can I attach devices to both mirrors, let
them resilver, then detach them and import the pool from those?


You'd want to export them, not detach them.  Detaching will 
overwrite the

vdev labels and make it un-importable.


thank you for the export/import idea, it does sound cleaner from a ZFS
perspective, but comes at the expense of temporarily unmounting the
filesystems.

So, instead of detaching, would unplugging, then detaching work?

I'm thinking something like this:

 - zpool create tank mirror dev1 dev2 dev3
 - {physically move dev3 to new box}
 - zpool detach tank dev3


If we're talking about a 3rd device, added in order to migrate the data,
why not just zfs send | zfs recv?


Time? The reason people go the split mirror route, at least in block 
land, is because once you split the volume you can export it someplace 
else and start using it. Same goes for constant replication where you 
suspend the replication, take a copy, go start working on it, restart 
the replication. (Lots of ways people do that one.)


I think the requirement could be voiced as, I want an independent copy 
of my data on a secondary system in a quick fashion. I want to avoid 
using resources from the primary system. The fun part is that people 
will think in terms of current technologies so you'll see split 
mirror, or volume copy or Truecopy mixed in for flavor.
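
For reference, the attach-and-wait part of the recipe above, using the 
same made-up device names; the open question in this thread is only what 
to do with dev3 afterwards:

# zpool attach tank dev1 dev3
# zpool status tank      (wait until the resilver is reported complete)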


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Size taken by a zfs symlink

2007-04-02 Thread Torrey McMahon

If I create a symlink inside a zfs file system and point the link to a
file on a ufs file system on the same node how much space should I
expect to see taken in the pool as used? Has this changed in the last
few months? I know work is being done under 6516171 to make symlinks
dittoable but I don't think that has gone back yet. (Has it?)

Thanks.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] Re: [storage-discuss] Detecting failed drive under MPxIO + ZFS

2007-03-29 Thread Torrey McMahon

Robert Milkowski wrote:


2. MPxIO - it tries to failover disk to second SP but looks like it
   tries it forever (or very very long). After some time it should
   have generated disk IO failure...
  


Are there any other hosts connected to this storage array? It looks like 
there might be another host ping-ponging the LUNs with this box.



3. I guess that in such a case Eric's proposal probably won't help and
   the real problem is with MPxIO - right?


Well ... I wouldn't say it's mpxio's fault either. At least not at this 
point.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS Boot support for the x86 platform

2007-03-28 Thread Torrey McMahon

Richard Elling wrote:

Cyril Plisko wrote:

First of all I'd like to congratulate the ZFS boot team with the
integration of their work into ON. Great job ! I am sure there
are plenty of people waiting anxiously for this putback.

I'd also like to suggest that the material referenced by HEADS UP
message [1] be made available to non-SWAN folks as well.

[1] http://opensolaris.org/os/community/on/flag-days/pages/2007032801/


This has already occurred.
http://www.opensolaris.org/os/community/on/flag-days/61-65/

maybe you were too quick on the trigger? :-)


The case materials aren't there. Also, I think Cyril meant the 
instructions on fs.central mentioned in the flag-day notice.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] zfs send speed

2007-03-20 Thread Torrey McMahon

Howdy folks.

I've a customer looking to use ZFS in a DR situation. They have a large 
data store where they will be taking snapshots every N minutes or so, 
sending the difference of the snapshot and previous snapshot with zfs 
send -i to a remote host, and in case of DR firing up the secondary.
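
(Roughly this loop, with made-up names and ssh as just one possible 
transport:

# zfs snapshot bigpool/data@t2
# zfs send -i t1 bigpool/data@t2 | ssh drhost zfs recv bigpool/data
# zfs destroy bigpool/data@t1

once t2 is safely on the far side.)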


However, I've seen a few references to the speed of zfs send being, 
well, a bit slow. Anyone want to comment on the current speed of zfs 
send? Any recent changes or issues found in this area?


Thanks.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] zfs send speed

2007-03-20 Thread Torrey McMahon

Matthew Ahrens wrote:

Torrey McMahon wrote:

Howdy folks.

I've a customer looking to use ZFS in a DR situation. They have a 
large data store where they will be taking snapshots every N minutes 
or so, sending the difference of the snapshot and previous snapshot 
with zfs send -i to a remote host, and in case of DR firing up the 
secondary.


Cool!


I sure hope so. ;-)



However, I've seen a few references to the speed of zfs send being, 
well, a bit slow. Anyone want to comment on the current speed of zfs 
send? Any recent changes or issues found in this area?


What bits are you running?  I made some recent improvements (6490104, 
fixed in build 53, targeted for s10u4).  There are still a few issues, 
but by and large, performance should be very good.


Can you describe what problem you're experiencing?  How much data, how 
many files, how big of a stream, what transport, how long it takes, 
are you seeing lots of CPU or disk activity on the sending or 
receiving side when it's slow?


I'm only doing an initial investigation now so I have no test data at 
this point. The reason I asked, and I should have tacked this on at the 
end of the last email, was a blog entry that stated zfs send was slow


http://www.lethargy.org/~jesus/archives/80-ZFS-send-trickle..html

Looking back through the discuss archives I didn't see anything else 
mentioned but some others mentioned it to me off line as well. It could 
be we all read the same blog entry so I figured I'd ask if anyone had 
seen such behavior recently. Hopefully, I can get a test bed setup 
fairly quickly and see how it works myself.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS party - PANIC collection

2007-03-14 Thread Torrey McMahon

Gino Ruopolo wrote:

Conclusion:
After a day of tests we are going to think that ZFS
  

doesn't work well with MPXIO.

  
  

What kind of array is this? If it is not a Sun array
then how are you 
configuring mpxio to recognize the array?



We are facing the same problems with a JBOD  (EMC DAE2), a Storageworks EVA and 
an old Storageworks EMA.
  


What makes you think that these arrays work with mpxio? Not every array 
automatically works.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Google paper on disk reliability

2007-02-19 Thread Torrey McMahon

Richard Elling wrote:

Akhilesh Mritunjai wrote:
I believe that the word would have gone around already, Google 
engineers have published a paper on disk reliability. It might 
supplement the ZFS FMA integration and well - all the numerous 
debates on spares etc etc over here.


Good paper.  They validate the old saying, complex systems fail in 
complex ways.
We've also done some internal (Sun) studies which cast doubt on the 
ability of SMART
to predict failures. 


 which is why we were never really fans of turning it on.
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS with SAN Disks and mutipathing

2007-02-18 Thread Torrey McMahon

Richard Elling wrote:

JS wrote:
I'm using ZFS on both EMC and Pillar arrays with PowerPath and MPxIO, 
respectively. Both work fine - the only caveat is to drop your 
sd_queue to around 20 or so, otherwise you can run into an ugly 
display of bus resets.


This is sd_max_throttle or ssd_max_throttle.  The problem is that the 
host can
easily overrun the storage for slow storage devices.  This will reduce 
the load
on the storage device.  Consult the storage configuration guidelines 
for recommended
values (default = 256 outstanding commands, in the old days EMC 
recommended 20).
Yes, we'd all like this problem to go away. 


Another note: this drops the queue size for all devices that use the sd 
or ssd driver.


I'm still not sure why EMC/HDS/Pillar boxes can't send a queue full 
response back when they start to fill up like other storage arrays do. 
It gets even worse when you have to do the HDS Math to set all your 
hosts to some low queue size.
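
For completeness, the knob being discussed lives in /etc/system; the 
value 20 below is the old EMC figure, not a recommendation, and it takes 
a reboot:

set sd:sd_max_throttle = 20
set ssd:ssd_max_throttle = 20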

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Is ZFS file system supports short writes ?

2007-02-15 Thread Torrey McMahon

Robert Milkowski wrote:


Hello dudekula,


Thursday, February 15, 2007, 11:08:26 AM, you wrote:

Hi all,

Please let me know the ZFS support for short writes?

And what are short writes?



http://www.pittstate.edu/wac/newwlassignments.html#ShortWrites :-P
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] number of lun's that zfs can handle

2007-02-15 Thread Torrey McMahon

Claus Guttesen wrote:


Our main storage is a HDS 9585V Thunder with vxfs and raid5 on 400 GB
sata disk handled by the storage system. If I would migrate to zfs
that would mean 390 jbod's. 


How so?
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Thumper Origins Q

2007-02-02 Thread Torrey McMahon

Richard Elling wrote:


One of the benefits of ZFS is that not only is head synchronization not
needed, but also block offsets do not have to be the same.  For example,
in a traditional mirror, block 1 on device 1 is paired with block 1 on
device 2.  In ZFS, this 1:1 mapping is not required.  I believe this will
result in ZFS being more resilient to disks with multiple block failures.
In order for a traditional RAID to implement this, it would basically
need to [re]invent a file system.


We had this fixed in T3 land a while ago so I think most storage arrays 
don't do the 1:1 mapping anymore. It's striped down the drives. In 
theory, you could lose more than one drive in a T3 mirror and still 
maintain data in certain situations.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: Thumper Origins Q

2007-02-02 Thread Torrey McMahon

Dale Ghent wrote:



Yeah sure it might eat into STK profits, but one will still have to 
go there for redundant controllers.


Repeat after me: There is no STK. There is only Sun. 8-)


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] hot spares - in standby?

2007-02-02 Thread Torrey McMahon

Richard Elling wrote:


Good question. If you consider that mechanical wear out is what 
ultimately

causes many failure modes, then the argument can be made that a spun down
disk should last longer. The problem is that there are failure modes 
which
are triggered by a spin up.  I've never seen field data showing the 
difference

between the two.


Often, the spare is up and running but for whatever reason you'll have a 
bad block on it and you'll die during the reconstruct. Periodically 
checking the spare means reading and writing from it over time in order to 
make sure it's still ok. (You take the spare out of the trunk, you look 
at it, you check the tire pressure, etc.) The issue I see coming down 
the road is that we'll start getting into a Golden Gate paint job 
where it takes so long to check the spare that we'll just keep the 
process going constantly. Not as much wear and tear as real i/o but it 
will still be up and running the entire time and you won't be able to 
spin the spare down.



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Re: ZFS or UFS - what to do?

2007-02-02 Thread Torrey McMahon

Marion Hakanson wrote:

However, given the default behavior of ZFS (as of Solaris-10U3) is to
panic/halt when it encounters a corrupted block that it can't repair,
I'm re-thinking our options, weighing against the possibility of a
significant downtime caused by a single-block corruption.


Guess what happens when UFS finds an inconsistency it can't fix either?

The issue is that ZFS has the chance to fix the inconsistency if the 
zpool is a mirror or raidZ. Not that it finds the inconsistency in the 
first place. ZFS will just find more of them given a set of errors vs 
other filesystems.




___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Project Proposal: Availability Suite

2007-02-02 Thread Torrey McMahon

Nicolas Williams wrote:

On Fri, Jan 26, 2007 at 05:15:28PM -0700, Jason J. W. Williams wrote:
  

Could the replication engine eventually be integrated more tightly
with ZFS? That would be slick alternative to send/recv.



But a continuous zfs send/recv would be cool too.  In fact, I think ZFS
tightly integrated with SNDR wouldn't be that much different from a
continuous zfs send/recv.


Even better with snapshots, and scoreboarding, and synch vs asynch and 
and and and .


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss

