Re: [zfs-discuss] ZFS and VMware

2010-08-15 Thread Richard Elling
On Aug 11, 2010, at 12:52 PM, Paul Kraus wrote:

>   I am looking for references of folks using ZFS with either NFS
> or iSCSI as the backing store for VMware (4.x)
> virtual machines. We asked the local VMware folks and they had not
> even heard of ZFS. Part of what we are looking for is a recommendation
> for NFS or iSCSI, and all VMware would say is "we support both". We
> are currently using Sun SE-6920, 6140, and 2540 hardware arrays via
> FC. We have started playing with ZFS/NFS, but have no experience with
> iSCSI. The ZFS backing store in some cases will be the hardware arrays
> (the 6920 has fallen off of VMware's supported list and if we front
> end it with either NFS or iSCSI it'll be supported, and VMware
> suggested that) and some of it will be backed by J4400 SATA disk.

At Nexenta, we have many customers using ZFS as backing store for
VMware and Citrix XenServer.  Nexenta also has a plugin to help you
integrate your VMware, XenServer, and Hyper-V virtual hosts with the
storage appliance. For more info, see
http://www.nexenta.com/corp/applications/vmdc
and the latest Nexenta docs, including the VMDC User's Guide are at:
http://www.nexenta.com/corp/documentation/product-documentation

Please share and enjoy: the joint EMC+NetApp storage best practices 
for configuring ESX apply to all NFS and block storage (*SCSI) 
environments. Google "TR-3428" and point me in the direction of any
later versions you find :-)

 -- richard

-- 
Richard Elling
rich...@nexenta.com   +1-760-896-4422
Enterprise class storage for everyone
www.nexenta.com



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-14 Thread Ross Walker
On Aug 14, 2010, at 8:26 AM, "Edward Ned Harvey"  wrote:

>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>> boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
>> 
>> #3  I previously believed that vmfs3 was able to handle sparse files
>> amazingly well, like, when you create a new vmdk, it appears almost
>> instantly regardless of size, and I believed you could copy sparse
>> vmdk's
>> efficiently, not needing to read all the sparse consecutive zeroes.  I
>> was
>> wrong.  
> 
> Correction:  I was originally right.  ;-)  
> 
> In ESXi, if you go to command line (which is busybox) then sparse copies are
> not efficient.
> If you go into vSphere, and browse the datastore, and copy vmdk files via
> gui, then it DOES copy efficiently.
> 
> The behavior is the same, regardless of NFS vs iSCSI.
> 
> You should always copy files via GUI.  That's the lesson here.

Technically, you should always copy vmdk files via vmkfstools on the command 
line. That will give you wire-speed transfers.
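
For reference, a clone along these lines keeps the copy thin where the target
format supports it (the datastore and file names here are made-up examples,
not from this thread):

    # clone a vmdk from the ESX(i) command line, preserving thin allocation
    vmkfstools -i /vmfs/volumes/datastore1/junk/junk.vmdk \
        /vmfs/volumes/datastore1/junk-copy/junk.vmdk -d thin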

-Ross

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-14 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Edward Ned Harvey
> 
> #3  I previously believed that vmfs3 was able to handle sparse files
> amazingly well, like, when you create a new vmdk, it appears almost
> instantly regardless of size, and I believed you could copy sparse
> vmdk's
> efficiently, not needing to read all the sparse consecutive zeroes.  I
> was
> wrong.  

Correction:  I was originally right.  ;-)  

In ESXi, if you go to the command line (which is busybox), then sparse copies
are not efficient.
If you go into vSphere, browse the datastore, and copy vmdk files via the
GUI, then it DOES copy efficiently.

The behavior is the same, regardless of NFS vs iSCSI.

You should always copy files via GUI.  That's the lesson here.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
> 
>I am looking for references of folks using ZFS with either NFS
> or iSCSI as the backing store for VMware (4.x)
> virtual machines. 

Since I had ulterior motives to test this, I spent a lot of time today
working on this anyway.  So I figured I might as well post some results
here:

#1  If there's any performance difference between iscsi vs nfs, it was
undetectable to me.  If there's any difference at all, nfs might be faster
in some cases.
#2  I previously speculated that performance of iscsi would outperform nfs,
because I thought vmware would create a file on NFS and then format that
file with vmfs3, thus doubling filesystem overhead.  I was wrong.  In
reality, ESXi uses the NFS datastore "raw."  Meaning, if you create some new
VM named "junk" with associated disks "junk.vmdk" etc, then those files are
created inside the NFS file server just like any other normal files.  There
is no vmfs3 overhead in between.
#3  I previously believed that vmfs3 was able to handle sparse files
amazingly well, like, when you create a new vmdk, it appears almost
instantly regardless of size, and I believed you could copy sparse vmdk's
efficiently, not needing to read all the sparse consecutive zeroes.  I was
wrong.  In reality, vmfs3 doesn't seem to have any advantage over *any*
other filesystem (ntfs, ext3, hfs+, etc.) in how it creates sparse files or
the disk space they occupy.  They do not copy efficiently.  I found that
copying a large sparse vmdk file, for all intents and purposes, behaves the
same inside vmfs3 as it does on nfs.
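
A quick way to see how sparse a vmdk really is on the ZFS/NFS side, using the
"junk.vmdk" example from #2 above, is to compare the apparent size with the
allocated blocks:

    ls -lh junk.vmdk    # apparent (provisioned) size
    du -h  junk.vmdk    # blocks actually allocated in the dataset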

Those things being said ... I couldn't find one reason at all in favor of
iscsi over nfs.  Except, perhaps, authentication, which may or may not offer
stronger security than NFS on a less-than-trusted LAN.

iscsi requires more work to set up.
iscsi has more restrictions on it - you have to choose a size, and can't
expand it.  It's formatted with vmfs3, so you cannot see the contents in any
way other than mounting it in ESX.
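
To give a rough idea of the difference in setup effort: the NFS case is one
dataset and one property, while the COMSTAR iSCSI case needs a zvol, a
logical unit, a target, and a view (pool, dataset, and volume names below are
hypothetical):

    # NFS
    zfs create tank/vmstore
    zfs set sharenfs=on tank/vmstore

    # iSCSI via COMSTAR
    zfs create -V 200G tank/vmlun
    svcadm enable -r svc:/system/stmf:default
    svcadm enable -r svc:/network/iscsi/target:default
    sbdadm create-lu /dev/zvol/rdsk/tank/vmlun
    stmfadm add-view <GUID reported by sbdadm>
    itadm create-target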

I could not find even one thing to promote iscsi over nfs.

Although it seems unlikely, if you wanted to disable ZIL instead of buying
log devices on the ZFS host, you can easily do this for NFS, and I'm not
aware of any way to do it with iscsi.  Maybe you can, I don't know.
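
For completeness, the usual (and generally discouraged) ways to do this are
the old global zil_disable tunable or, on builds that have the sync property,
disabling it per dataset; both apply to zvols as well as filesystems, so
strictly speaking neither is NFS-specific (dataset name is a placeholder):

    # older builds: global tunable in /etc/system (a config entry, not a shell command)
    set zfs:zil_disable = 1

    # builds with the sync property: per dataset (or zvol)
    zfs set sync=disabled tank/vmstore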

I mean ... It wasn't like Mike Tyson beating up a little kid, but it was
like a grown-up beating up an adolescent.  ;-)  Extremely one-sided as far
as I can tell.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread David Magda
On Fri, August 13, 2010 11:39, F. Wessels wrote:
> I wasn't planning to buy any SSD as a ZIL. I merely acknowledged that a
> SandForce-based SSD with a supercap MIGHT be a solution. At least the supercap
> should take care of data loss in case of a power failure. But they are still
> in the consumer realm and have not been picked up by the enterprise (yet), for
> whatever reason. I must admit I've heard that the SandForce drives didn't
> really live up to expectations, at least as a slog device.

IBM appears to use SandForce for some stuff:

http://tinyurl.com/3xtvch4
http://www.engadget.com/2010/05/03/sandforce-makes-ssds-cheaper-faster-more-reliable-just-how/


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread F. Wessels
I wasn't planning to buy any SSD as a ZIL. I merely acknowledged that a 
SandForce-based SSD with a supercap MIGHT be a solution. At least the supercap 
should take care of data loss in case of a power failure. But they are still in 
the consumer realm and have not been picked up by the enterprise (yet), for 
whatever reason. I must admit I've heard that the SandForce drives didn't really 
live up to expectations, at least as a slog device.
I think a lot of people on this mailing list would be very interested in your 
evaluation of the SSDs, to prevent costly mistakes.

Thanks for the scripts, I'll send you an email about them. 

And for everybody else here's a good entry about the DDRdrive X1:
http://blogs.sun.com/ahl/entry/ddrdrive
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread Saxon, Will
> -Original Message-
> From: zfs-discuss-boun...@opensolaris.org 
> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Eff Norwood
> Sent: Friday, August 13, 2010 10:26 AM
> To: zfs-discuss@opensolaris.org
> Subject: Re: [zfs-discuss] ZFS and VMware
> 
> Don't waste your time with something other than the DDRdrive 
> for NFS ZIL. If it's RAM based it might work, but why risk it 
> and if it's an SSD forget it. No SSD will work well for the 
> ZIL long term. Short term the only SSD to consider would be 
> Intel, but again long term even that will not work out for 
> you. The 100% write characteristics of the ZIL are an SSDs 
> worst case scenario especially without TRIM support. We have 
> tried them all - Samsung, SanDisk, OCZ and none of those 
> worked out. In particular, anything Sandforce 1500 based was 
> the worst so avoid those at all costs if you dare to try an 
> SSD ZIL. Don't. :)

What was the observed behavior with the SF-1500 based SSDs? I was planning to 
purchase something based on these next year, specifically for use as a slog.

-Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread Eff Norwood
Don't waste your time with anything other than the DDRdrive for an NFS ZIL. If 
it's RAM based it might work, but why risk it; and if it's an SSD, forget it. No 
SSD will work well for the ZIL long term. Short term the only SSD to consider 
would be Intel, but again, long term even that will not work out for you. The 
100% write characteristics of the ZIL are an SSD's worst-case scenario, 
especially without TRIM support. We have tried them all - Samsung, SanDisk, OCZ 
- and none of those worked out. In particular, anything SandForce 1500 based was 
the worst, so avoid those at all costs if you dare to try an SSD ZIL. Don't. :)

As for the queue depths, here's the command from the ZFS Evil Tuning Guide:

echo zfs_vdev_max_pending/W0t10 | mdb -kw

The W0t10 part is what to change: W0t35 (the old default of 35 outstanding 
I/Os per vdev) was the previous value, and 10 is the new one. For our NFS 
environment, we found W0t2 was best, based on looking at the actual I/O with 
dtrace scripts. Email me if you want those scripts. They are also here, but 
need to be edited before they work:

http://blogs.sun.com/chrisg/entry/latency_bubble_in_your_io
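
If you want to check the current value first, or make the change persistent,
something like this works (the /etc/system line is a config entry, not a
shell command):

    echo zfs_vdev_max_pending/D | mdb -k      # print the current value
    # in /etc/system, to survive a reboot:
    set zfs:zfs_vdev_max_pending = 10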
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread F. Wessels
Yes, the SandForce based SSDs are also interesting. I think both (the 1500 
certainly can) could be fitted with the necessary supercap to prevent data loss 
in case of unexpected power loss. And the 1500 based models will be available 
with a SAS interface, which is needed for clustering - something the DDRdrive 
cannot do. BUT at this moment they certainly do not match the DDRdrive in 
performance, and probably not in MTBF either. The DDRdrive only writes to DDR 
RAM, hence the name; only in case of power loss are the RAM contents written to 
flash. At least this is what I understand of it. DDR RAM doesn't suffer the 
wear/degradation that any type of flash memory suffers.
You can buy multiple SandForce SSDs with a supercap for the price of a single 
DDRdrive X1. Choice is good!
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread David Magda
On Fri, August 13, 2010 07:21, F. Wessels wrote:
> I fully agree with your post. NFS is much simpler in administration.
> Although I don't have any experience with the DDRdrive X1, I've read and
> heard from various people actually using them that it's the best
> "available" SLOG device. Before everybody starts yelling "ZEUS" or
> "LOGZILLA". Was anybody able to buy one? Apart from SUN. The DDRdrive X1
> is available and you can buy several for one ZEUS.

STEC only sells to OEMs at this time. From past discussions on this list,
I think the only dependable SSD alternatives are devices based on the
SandForce SF-1500 controller:

http://www.google.com/search?q=SandForce+SF-1500

For all other products, there are questions about whether the devices respect
SYNC commands (i.e., whether they lie about them), and issues with the lack of
supercaps.

The SandForce/ZFS thread (January 2010: "preview of new SSD based on
SandForce controller") can be found at:

http://tinyurl.com/2c6hvqs#35376
http://mail.opensolaris.org/pipermail/zfs-discuss/2010-January/thread.html#35376


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-13 Thread F. Wessels
I fully agree with your post. NFS is much simpler to administer.
Although I don't have any experience with the DDRdrive X1, I've read and heard 
from various people actually using them that it's the best "available" SLOG 
device. Before everybody starts yelling "ZEUS" or "LOGZILLA": was anybody able 
to buy one, apart from Sun? The DDRdrive X1 is available, and you can buy 
several for the price of one ZEUS.
Good to hear another success story. As soon as I have the budget I'm going to 
buy a pair of them.
My question is about I/O queue depths: which queues do you mean, the ones on 
the VMware side or the ones on OpenSolaris? Can you give some examples and 
actual settings?
Oh, and in what chassis model have you mounted the DDRdrive?

Thanks in advance
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-12 Thread Miles Nordin
> "sw" == Saxon, Will  writes:

sw> It was and may still be common to use RDM for VMs that need
sw> very high IO performance. It also used to be the only
sw> supported way to get thin provisioning for an individual VM
sw> disk. However, VMware regularly makes a lot of noise about how
sw> VMFS does not hurt performance enough to outweigh its benefits

What's the performance of configuring the guest to boot off iSCSI or
NFS directly using its own initiator/client, through the virtual
network adapter?  Is it better or worse, and does it interfere with
migration?

Or is this difficult enough that no one using vmware does it, and anyone who
would do it is already using Xen and in-house scripts instead of vmware
black-box proprietary crap?

It seems to me a native NFS guest would go much easier on the DDT.  I
found it frustrating I could not change the blocksize of XFS: it is
locked at 4kB.

I would guess there is still no vIB adapter, so if you want to use SRP
you are stuck presenting IB storage to guests with vmware virtual scsi
card.

but I don't know if the common wisdom, ``TOE is a worthless gimmick.
modern network cards and TCP stacks are as good as SCSI cards,'' still
applies when the adapters are virtual ones, so I'd be interested to
hear from someone running guests in this way (without virtual storage
adapters).



___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-12 Thread Richard Jahnel
We are using ZFS-backed fibre channel targets for ESXi 4.1 (and previously 4.0) 
and have had good performance with no issues. The fibre LUNs were formatted with 
VMFS by the ESXi boxes.

SQLIO benchmarks from a guest system running on a fibre-attached ESXi host:

File Size  Threads  R/W  Duration  Sector Size  Pattern  IOs       IO/Sec  MB/Sec  Latency (ms)
(MB)                     (sec)     (KB)                  outstand.                 min  avg  max

24576      8        R    30        8            random   64        37645   294     0    1    141
24576      8        W    30        8            random   64        17304   135     0    3    303
24576      8        R    30        64           random   64        6250    391     1    9    176
24576      8        W    30        64           random   64        5742    359     1    10   203

The array is a raidz2 of 14 x 256 GB Patriot Torqx drives, plus a cache (L2ARC) 
of 4 x 32 GB Intel G1s.

When I get around to doing the next series of boxes I'll probably use C300s in 
place of the Indilinx-based drives.

iSCSI was disappointing and seemed to be CPU bound, possibly due to a stupid 
number of interrupts coming from the less-than-stellar NIC on the test box.

NFS we have only used as an ISO store, but it has worked OK and without issues.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-12 Thread Eff Norwood
We are doing NFS in VMware 4.0U2 in production, 50K users, using OpenSolaris 
snv_134 on SuperMicro boxes with SATA drives. Yes, I am crazy. Our experience 
has been that iSCSI for ESXi 4.x is fast and works well with minimal fussing 
until there is a problem. When that problem happens, getting to the data on 
VMFS LUNs, even with the free Java VMFS utility, is problematic at best and 
game over at worst.

With NFS, data access in problem situations is a non-event. Snapshots happen 
and everyone is happy. The problem with it is the VMware NFS client, which 
makes every write a synchronous (FILE_SYNC) write. That kills NFS performance 
dead. To get around that, we're using DDRdrive X1s for our ZIL and the problem 
is solved. I have not looked at the NFS client changes in 4.1; perhaps it's 
better, or at least tunable, now.

I would recommend NFS as the overall strategy, but you must get a good ZIL 
device to make that happen. Do not disable the ZIL. Do make sure you set your 
I/O queue depths correctly.
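
For reference, whatever device you pick, a dedicated log is just another vdev
added to the pool (pool and device names below are placeholders):

    zpool add tank log mirror c3t0d0 c3t1d0   # mirrored slog
    zpool status tank                         # shows up under "logs"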
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-12 Thread Paul Kraus
On Wed, Aug 11, 2010 at 6:15 PM, Saxon, Will  wrote:

>
> It really depends on your VM system, what you plan on doing with VMs and how 
> you plan to do it.
>
> I have the vSphere Enterprise product and I am using the DRS feature, so VMs 
> are vmotioned around
> my cluster all throughout the day. All of my VM users are able to create and 
> manage their own VMs
> through the vSphere client. None of them care to know anything about VM 
> storage as long as it's
> fast, and most of them don't want to have to make choices about which 
> datastore to put their new
> VM on. Only 30-40% of the total number of VMs registered in the cluster are 
> powered on at any given time.

We have three production VMware VSphere 4 clusters, each with
four hosts. The number of guests varies, but ranges from a low of 40
on one cluster to 80 on another. We do not generally have many guests
being created or destroyed, but a slow steady growth in their numbers.

The guests are both production as well as test / development
and the vast majority of them are Windows, mostly Server 2008. The
rule is to roll out Windows servers as VMs with the notable exception
of the Exchange servers, which are physical servers. The VMs are used
for everything including domain controllers, file servers, print
servers, dhcp servers, dns servers, workstations (my physical desktop
runs Linux, but I need a Windows system for Outlook and a few other
applications; that runs as a VM), SharePoint servers, MS-SQL servers,
and other assorted application servers.

We are using DRS and VMs do migrate around a bit
(transparently). We take advantage of "maintenance mode" for exactly
what the name says.

We have had a fairly constant, but low, rate of FC issues with
VMware, from when we first rolled out VMware (version 3.0) through
today (4.1). The multi-pathing seems to occasionally either lose one
or more paths to a given LUN or completely lose access to a given
LUN. These problems do not happen often, but when they do they have
caused downtime on production VMs. Part of the reason we started
looking at NFS/iSCSI was to get around the VMware (Linux) FC drivers.
We also like the low-overhead snapshot feature of ZFS (and are
leveraging it for other data extensively).

Now we are getting serious about using ZFS + NFS/iSCSI and are
looking to learn from others' experience as well as our own. For
example, is anyone using NFS with Oracle Cluster for HA storage for
VMs, or are sites trusting a single NFS server?

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
>
>
>
> My understanding is that if you wanted to use MS Cluster Server, you'd need
> to use a LUN as an RDM for the quorum drive. VMDK files are locked when
> open, so they can't typically be shared. VMware's Fault Tolerance gets
> around this somehow, and I have a suspicion that their Lab Manager product
> does as well.
>
>
Right, but again, we're talking about storing virtual machines, not RDMs.
 Using MSCS on top of VMware rarely makes any sense, and MS is doing
their damnedest to make it as painful as possible for those who try
anyway.  There's nothing stopping you from putting your virtual machine on
an NFS datastore, and mounting a LUN directly to the guest OS with a
software iSCSI client and cutting out the middleman and bypassing the RDM
entirely... which just adds yet another headache when it comes to things
like SRM and vmotion.




> I don't think you can use VMware's built-in multipathing with NFS. Maybe
> it's possible, it doesn't look that way but I'm not going to verify it one
> way or the other. There are probably better/alternative ways to achieve the
> same thing with NFS.
>

You can achieve the same thing with a little bit of forethought on your
network design.  No, ALUA is not compatible with NFS, it is a block protocol
feature.  Then again, ALUA is also not compatible with the MSCS example you
listed above.




> The new VAAI stuff that VMware announced with vSphere 4.1 does not support
> NFS (yet), it only works with storage servers that implement the required
> commands.
>
>
VAAI is an attempt to give block storage more NFS-like features (for instance,
finer-grained locking, which already exists in NFS by default).  The
"features" are basically useless in an NFS environment on intelligent
storage.



> The locked LUN thing has happened to me once. I've had more trouble with
> thin provisioning and negligence leading to a totally-full VMFS, which is
> irritating to recover from, and moved/restored luns needing VMFS
> resignaturing, which is also irritating.
>
> I don't want to argue with you about the other stuff.
>
>
Which is why block with vmware blows :)

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Saxon, Will
> -Original Message-
> From: Tim Cook [mailto:t...@cook.ms] 
> Sent: Wednesday, August 11, 2010 10:42 PM
> To: Saxon, Will
> Cc: Edward Ned Harvey; ZFS Discussions
> Subject: Re: [zfs-discuss] ZFS and VMware
> 
> 
>   I still think there are reasons why iSCSI would be 
> better than NFS and vice versa.
>   
>   
> 
> 
> I'd love for you to name one.  Short of a piss-poor NFS 
> server implementation, I've never once seen iSCSI beat out 
> NFS in a VMware environment.  I have however seen countless 
> examples of their "clustered filesystem" causing permanent 
> SCSI locks on a LUN that result in an entire datastore going offline.

My understanding is that if you wanted to use MS Cluster Server, you'd need to 
use a LUN as an RDM for the quorum drive. VMDK files are locked when open, so 
they can't typically be shared. VMware's Fault Tolerance gets around this 
somehow, and I have a suspicion that their Lab Manager product does as well. 

I don't think you can use VMware's built-in multipathing with NFS. Maybe it's 
possible, it doesn't look that way but I'm not going to verify it one way or 
the other. There are probably better/alternative ways to achieve the same thing 
with NFS.

The new VAAI stuff that VMware announced with vSphere 4.1 does not support NFS 
(yet); it only works with storage servers that implement the required commands. 

The locked LUN thing has happened to me once. I've had more trouble with thin 
provisioning and negligence leading to a totally-full VMFS, which is irritating 
to recover from, and moved/restored luns needing VMFS resignaturing, which is 
also irritating.

I don't want to argue with you about the other stuff. 

-Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware (and now, VirtualBox)

2010-08-11 Thread Erik Trimble
Actually, this brings up a related issue. Does anyone have experience
with running VirtualBox on iSCSI volumes vs NFS shares, both of which
would be backed by a ZFS server?

-Erik



On Wed, 2010-08-11 at 21:41 -0500, Tim Cook wrote:
> 
> 
> 
> This is not entirely correct either. You're not forced to use
> VMFS.
>  
> It is entirely true.  You absolutely cannot use ESX with a guest on a
> block device without formatting the LUN with VMFS.  You are *FORCED*
> to use VMFS.
>  
> 
> 
> You can format the LUN with VMFS, then put VM files inside the
> VMFS; in this case you get the Guest OS filesystem inside a
> VMDK file on the VMFS filesystem inside a LUN/ZVOL on your ZFS
> filesystem. You can also set up Raw Device Mapping (RDM)
> directly to a LUN, in which case you get the Guest OS
> filesystem inside the LUN/ZVOL on your ZFS filesystem. There
> has to be VMFS available somewhere to store metadata, though.
> 
>  
> You cannot boot a VM off an RDM.  You *HAVE* to use VMFS with block
> devices for your guest operating systems.  Regardless, we aren't
> talking about RDM's, we're talking about storing virtual machines.
>  
> 
> 
> It was and may still be common to use RDM for VMs that need
> very high IO performance. It also used to be the only
> supported way to get thin provisioning for an individual VM
> disk. However, VMware regularly makes a lot of noise about how
> VMFS does not hurt performance enough to outweigh its benefits
> anymore, and thin provisioning has been a native/supported
> feature on VMFS datastores since 4.0.
> 
> I still think there are reasons why iSCSI would be better than
> NFS and vice versa.
> 
> 
> 
> I'd love for you to name one.  Short of a piss-poor NFS server
> implementation, I've never once seen iSCSI beat out NFS in a VMware
> environment.  I have however seen countless examples of their
> "clustered filesystem" causing permanent SCSI locks on a LUN that
> result in an entire datastore going offline.
> 
> 
> --Tim
> ___
> zfs-discuss mailing list
> zfs-discuss@opensolaris.org
> http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
-- 
Erik Trimble
Java System Support
Mailstop:  usca22-123
Phone:  x17195
Santa Clara, CA
Timezone: US/Pacific (GMT-0800)

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
>
>
>
> This is not entirely correct either. You're not forced to use VMFS.
>

It is entirely true.  You absolutely cannot use ESX with a guest on a block
device without formatting the LUN with VMFS.  You are *FORCED* to use VMFS.


You can format the LUN with VMFS, then put VM files inside the VMFS; in this
> case you get the Guest OS filesystem inside a VMDK file on the VMFS
> filesystem inside a LUN/ZVOL on your ZFS filesystem. You can also set up Raw
> Device Mapping (RDM) directly to a LUN, in which case you get the Guest OS
> filesystem inside the LUN/ZVOL on your ZFS filesystem. There has to be VMFS
> available somewhere to store metadata, though.
>
>
You cannot boot a VM off an RDM.  You *HAVE* to use VMFS with block devices
for your guest operating systems.  Regardless, we aren't talking about
RDM's, we're talking about storing virtual machines.


It was and may still be common to use RDM for VMs that need very high IO
> performance. It also used to be the only supported way to get thin
> provisioning for an individual VM disk. However, VMware regularly makes a
> lot of noise about how VMFS does not hurt performance enough to outweigh its
> benefits anymore, and thin provisioning has been a native/supported feature
> on VMFS datastores since 4.0.
>
> I still think there are reasons why iSCSI would be better than NFS and vice
> versa.
>
>
I'd love for you to name one.  Short of a piss-poor NFS server
implementation, I've never once seen iSCSI beat out NFS in a VMware
environment.  I have however seen countless examples of their "clustered
filesystem" causing permanent SCSI locks on a LUN that result in an entire
datastore going offline.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Saxon, Will
> -Original Message-
> From: zfs-discuss-boun...@opensolaris.org 
> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Tim Cook
> Sent: Wednesday, August 11, 2010 8:46 PM
> To: Edward Ned Harvey
> Cc: ZFS Discussions
> Subject: Re: [zfs-discuss] ZFS and VMware
> 
> 
> 
> On Wed, Aug 11, 2010 at 7:27 PM, Edward Ned Harvey 
>  wrote:
> 
> 
>   > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
>   > boun...@opensolaris.org] On Behalf Of Paul Kraus
>   >
>   
>   >I am looking for references of folks using ZFS 
> with either NFS
>   > or iSCSI as the backing store for VMware (4.x) virtual machines.
>   
>   
>   I'll try to clearly separate what I know, from what I speculate:
>   
>   I know you can do either one, NFS or iscsi served by 
> ZFS for the backend
>   datastore used by ESX.  I know (99.9%) that vmware will 
> issue sync-mode
>   operations in both cases.  Which means you are strongly 
> encouraged to use a
>   mirrored dedicated log device, presumably SSD or some 
> sort of high IOPS low
>   latency devices.
>   
>   I speculate that iscsi will perform better.  If you 
> serve it up via NFS,
>   then vmware is going to create a file in your NFS 
> filesystem, and inside
>   that file it will create a new filesystem.  So you get 
> twice the filesytem
>   overhead.  Whereas in iscsi, ZFS presents a raw device 
> to VMware, and then
>   vmware maintains its filesystem in that.
>   
> 
> 
> 
> 
> That's not true at all.  Whether you use iSCSI or NFS, VMware 
> is laying down a file which it presents as a disk to the 
> guest VM which then formats it with its own filesystem.  
> That's the advantage of virtualization.  You've got a big 
> file you can pick up and move anywhere that is hardware 
> agnostic.  With iSCSI, you're forced to use VMFS, which is an 
> adaptation of the legato clustered filesystem from the early 
> 90's.  It is nowhere near as robust as NFS, and I can't think 
> of a reason you would use it if given the choice; short of a 
> massive pre-existing investment in Fibre Channel.  With NFS, 
> you're simply using ZFS, there is no VMFS to worry about.  
> You don't have to have another ESX box if something goes 
> wrong, any client with an nfs client can mount the share and 
> diagnose the VMDK.
> 
> --Tim
> 

This is not entirely correct either. You're not forced to use VMFS.

You can format the LUN with VMFS, then put VM files inside the VMFS; in this 
case you get the Guest OS filesystem inside a VMDK file on the VMFS filesystem 
inside a LUN/ZVOL on your ZFS filesystem. You can also set up Raw Device 
Mapping (RDM) directly to a LUN, in which case you get the Guest OS filesystem 
inside the LUN/ZVOL on your ZFS filesystem. There has to be VMFS available 
somewhere to store metadata, though.

It was and may still be common to use RDM for VMs that need very high IO 
performance. It also used to be the only supported way to get thin provisioning 
for an individual VM disk. However, VMware regularly makes a lot of noise about 
how VMFS does not hurt performance enough to outweigh its benefits anymore, and 
thin provisioning has been a native/supported feature on VMFS datastores since 
4.0.

I still think there are reasons why iSCSI would be better than NFS and vice 
versa.

-Will
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Tim Cook
On Wed, Aug 11, 2010 at 7:27 PM, Edward Ned Harvey wrote:

> > From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> > boun...@opensolaris.org] On Behalf Of Paul Kraus
> >
> >I am looking for references of folks using ZFS with either NFS
> > or iSCSI as the backing store for VMware (4.x) virtual machines.
>
> I'll try to clearly separate what I know, from what I speculate:
>
> I know you can do either one, NFS or iscsi served by ZFS for the backend
> datastore used by ESX.  I know (99.9%) that vmware will issue sync-mode
> operations in both cases.  Which means you are strongly encouraged to use a
> mirrored dedicated log device, presumably SSD or some sort of high IOPS low
> latency devices.
>
> I speculate that iscsi will perform better.  If you serve it up via NFS,
> then vmware is going to create a file in your NFS filesystem, and inside
> that file it will create a new filesystem.  So you get twice the filesystem
> overhead.  Whereas in iscsi, ZFS presents a raw device to VMware, and then
> vmware maintains its filesystem in that.
>
>
>
That's not true at all.  Whether you use iSCSI or NFS, VMware is laying down
a file which it presents as a disk to the guest VM which then formats it
with its own filesystem.  That's the advantage of virtualization.  You've
got a big file you can pick up and move anywhere that is hardware agnostic.
 With iSCSI, you're forced to use VMFS, which is an adaptation of the legato
clustered filesystem from the early 90's.  It is nowhere near as robust as
NFS, and I can't think of a reason you would use it if given the choice;
short of a massive pre-existing investment in Fibre Channel.  With NFS,
you're simply using ZFS, there is no VMFS to worry about.  You don't have to
have another ESX box if something goes wrong, any client with an nfs client
can mount the share and diagnose the VMDK.

--Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Edward Ned Harvey
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Paul Kraus
> 
>I am looking for references of folks using ZFS with either NFS
> > or iSCSI as the backing store for VMware (4.x) virtual machines.

I'll try to clearly separate what I know, from what I speculate:

I know you can do either one, NFS or iscsi served by ZFS for the backend
datastore used by ESX.  I know (99.9%) that vmware will issue sync-mode
operations in both cases.  Which means you are strongly encouraged to use a
mirrored dedicated log device, presumably SSD or some sort of high IOPS low
latency devices.

I speculate that iscsi will perform better.  If you serve it up via NFS,
then vmware is going to create a file in your NFS filesystem, and inside
that file it will create a new filesystem.  So you get twice the filesystem
overhead.  Whereas in iscsi, ZFS presents a raw device to VMware, and then
vmware maintains its filesystem in that.

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Saxon, Will
> -Original Message-
> From: zfs-discuss-boun...@opensolaris.org 
> [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Paul Kraus
> Sent: Wednesday, August 11, 2010 3:53 PM
> To: ZFS Discussions
> Subject: [zfs-discuss] ZFS and VMware
> 
>I am looking for references of folks using ZFS with either NFS
> or iSCSI as the backing store for VMware (4.x)
> virtual machines. We asked the local VMware folks and they had not
> even heard of ZFS. Part of what we are looking for is a recommendation
> for NFS or iSCSI, and all VMware would say is "we support both". We
> are currently using Sun SE-6920, 6140, and 2540 hardware arrays via
> FC. We have started playing with ZFS/NFS, but have no experience with
> iSCSI. The ZFS backing store in some cases will be the hardware arrays
> (the 6920 has fallen off of VMware's supported list and if we front
> end it with either NFS or iSCSI it'll be supported, and VMware
> suggested that) and some of it will be backed by J4400 SATA disk.
> 
>I have seen some discussion of this here, but it has all been
> related to very specific configurations and issues, I am looking for
> general recommendations and experiences. Thanks.
> 

It really depends on your VM system, what you plan on doing with VMs and how 
you plan to do it. 

I have the vSphere Enterprise product and I am using the DRS feature, so VMs 
are vmotioned around my cluster all throughout the day. All of my VM users are 
able to create and manage their own VMs through the vSphere client. None of 
them care to know anything about VM storage as long as it's fast, and most of 
them don't want to have to make choices about which datastore to put their new 
VM on. Only 30-40% of the total number of VMs registered in the cluster are 
powered on at any given time. 

I am using OpenSolaris and ZFS to provide a relatively small NFS datastore as a 
proof of concept. I am trying to demonstrate that it's a better solution for us 
than our existing solution, which is Windows Storage Server and the MS iSCSI 
Software Target. The ZFS-based datastore is hosted on six 146GB 10krpm SAS 
drives configured as a 3x2 mirror, a 30GB SSD as L2ARC and a 1GB ramdisk as the 
SLOG. Deduplication and compression (lzjb) are enabled. The server itself is a 
dual quad core core2-level system with 48GB RAM - it is going to be a VM host 
in the cluster after this project is concluded. 
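
As a rough sketch, a pool along the lines described above could be put
together like this (all pool, dataset, and device names are invented; note
that a ramdisk slog is volatile, which is part of why this is only a proof of
concept):

    zpool create vmpool mirror c1t0d0 c1t1d0 mirror c1t2d0 c1t3d0 \
        mirror c1t4d0 c1t5d0
    zpool add vmpool cache c2t0d0              # SSD as L2ARC
    ramdiskadm -a slog0 1g
    zpool add vmpool log /dev/ramdisk/slog0    # ramdisk as SLOG (volatile!)
    zfs create vmpool/nfsds
    zfs set compression=lzjb vmpool/nfsds
    zfs set dedup=on vmpool/nfsds
    zfs set sharenfs=on vmpool/nfsds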

Based on the experience and information I've gathered thus far, here is what I 
think:

The biggest thing for me is that I think I will be able to use deduplication, 
compression and a bunch of mirror vdevs using ZFS, whereas with other products 
I would need to use RAID 5 or 6 to get enough capacity with my budget. 
Larger/cheaper drives are also a possibility with ZFS since 
dedup/compression/ARC/L2ARC cuts down on IO to the disk. 

NFS Pros: NFS is much easier/faster to configure. Dedup and compression work 
better as the VM files sit directly on the filesystem. There is potential for 
faster provisioning by doing a local copy vs. having VMware do it remotely over 
NFS. It's nice to be able to get at each VM file directly from the fs as 
opposed to remotely via the vSphere client or service console (which will 
disappear with the next VMware release). You can use fast SSD SLOGs to 
accelerate most (all?) writes to the ZIL.

NFS Cons: VMware does not do NFSv4, so each of your filesystems will require a 
separate mount. There is a maximum number of mounts per cluster (64 with 
vSphere 4.0). There is no opportunity for load balancing between the client and 
a single datastore. VMware makes all writes synchronous writes, so you really 
need to have SLOGs (or a RAID controller with BBWC) to make the hosted VMs 
usable. VMware does not give you any vm-level disk performance statistics for 
VMs on an NFS datastore (at least, not through the client).

iSCSI Pros: COMSTAR rocks - you can set up your LUNs on an iSCSI target today, 
and move them to FC/FCoE/SRP tomorrow. Cloning zvols is fast, which could be 
leveraged for fast VM provisioning. iSCSI supports multipathing, so you can 
take advantage of VMware built-in NMP to do load balancing. You don't need a 
SLOG as much because you'll only have synchronous writes if the VM requests 
them.
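
The zvol cloning mentioned above is just a snapshot plus a clone, with the
clone then exposed as a new LUN (names below are hypothetical):

    zfs snapshot tank/vm-gold@template
    zfs clone tank/vm-gold@template tank/vm01
    sbdadm create-lu /dev/zvol/rdsk/tank/vm01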

iSCSI Cons: It's harder to take full advantage of dedup and compression. Basic 
configuration is not hard, but is still much more complicated than NFS. Other 
than that, the rest of the cons are all VMware related. vSphere has a limit of 
256 LUNs per host, which in a cluster supporting vmotion basically means 256 
LUNs per cluster. This limit may mean that cloning zvols to speed up VM 
provisioning is not possible. You can have multiple VMs per LUN using VMFS, but 
if you make LUNs too large you run into locking issues when provisioning - 
general wisdom is to k

Re: [zfs-discuss] ZFS and VMware

2010-08-11 Thread Simone Caldana
Hi Paul,

I am using ESXi 4.0 with an NFS-on-ZFS datastore running on OSOL b134. It 
previously ran on Solaris 10u7 with VMware Server 2.x. Disks are SATA drives in 
a JBOD over FC.

I'll try to summarize my experience here, although our system does not provide 
services to end users and thus is not very stressed (it supports our internal 
developers only).

There are some things to know before setting up ZFS storage, in my opinion.
1. You cannot shrink a zpool. You can only detach disks from mirrors, not from 
raidz. You can grow it, though, by either replacing disks with higher-capacity 
ones or by adding more disks to the pool.

2. Due to ZFS's inherently coherent structure, sync writes (especially random 
ones) are its worst enemy: the cure is to bring a pair of mirrored SSDs into 
the pool or to use a battery-backed write cache, especially if you want to use 
raidz.
3. VMware, regardless of NFS or iSCSI, will do synchronous writes. Due to point 
2 above, if your workload and number of VMs are significant you will definitely 
need some kind of disk device based on memory and not on platters. YMMV.

4. ZFS snapshotting is great, but it can burn a sizeable amount of disk if you 
leave your VMs' local disks mounted without the noatime option (assuming they 
are unix machines), because the vmdks will get written to even when the 
processes inside the VM only read files. (In my case ~60 idle Linux machines 
burned up to 200MB/hr, generating an average of 2MB/sec of write traffic; see 
the sketch after this list.)

5. Avoid putting disks of different size and performance in a zpool, unless of 
course they are doing a different job. ZFS doesn't weight by size or 
performance and spreads data out evenly. A zpool will perform like the slowest 
of its members (not all the time, but when the workload is high the slowest 
disk will be the limiting factor).
6. ZFS performs badly on pools that are more than 80% full. Keep that in mind 
when you size things.

7. ZFS compression works wonders, especially the default algorithm: it costs 
little CPU, it doesn't increase latency (unless you have a very unbalanced 
CPU/disk system), and it saves space and bandwidth.
8. By mirroring/raiding things at the OS level, ZFS effectively multiplies the 
bandwidth used on the bus. My SATA disks can sustain 60MB/sec of writes, but in 
a zpool made of 6 mirrors of 2 disks each over a 2Gbit fibre link the maximum 
throughput is ~95MB/sec: the fibre maxes out at 190MB/sec, but Solaris needs to 
write to both disks of each mirror. You can partially work around this by 
putting each side of a mirror on a different storage enclosure and/or 
increasing the number of paths to the disks.

9. Deduplication is great on paper and can be wonderful in virtualized 
environments, but it has a BIG cost up front: search around and do your math, 
but be aware that you'll need tons of RAM and SSDs to effectively deduplicate 
multi-terabyte storage. It is also common opinion that it's not ready for 
production.
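
A minimal illustration of points 4, 6 and 7 (the pool name and the guest
fstab line are made-up examples):

    # inside a Linux guest: mount with noatime so idle reads don't dirty the vmdk
    #   /dev/sda1  /  ext3  defaults,noatime  1 1     (in /etc/fstab)

    # on the ZFS host: watch pool occupancy and enable the default compression
    zpool list tank          # CAP column; try to stay under ~80%
    zfs set compression=on tank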

If the analysis/tests of your use case tell you ZFS is a viable option, I 
think you should give it a try. Administration is wonderfully flexible and 
easy: once you set up your zpools (which is the really critical phase) you can 
do practically anything you want in an efficient way. So far I'm very pleased 
with it.

-- 
Simone Caldana
Senior Consultant
Critical Path
via Cuniberti 58, 10100 Torino, Italia
+39 011 4513811 (Direct)
+39 011 4513825 (Fax)
simone.cald...@criticalpath.net
http://www.cp.net/

Critical Path
A global leader in digital communications


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


[zfs-discuss] ZFS and VMware

2010-08-11 Thread Paul Kraus
   I am looking for references of folks using ZFS with either NFS
or iSCSI as the backing store for VMware (4.x)
virtual machines. We asked the local VMware folks and they had not
even heard of ZFS. Part of what we are looking for is a recommendation
for NFS or iSCSI, and all VMware would say is "we support both". We
are currently using Sun SE-6920, 6140, and 2540 hardware arrays via
FC. We have started playing with ZFS/NFS, but have no experience with
iSCSI. The ZFS backing store in some cases will be the hardware arrays
(the 6920 has fallen off of VMware's supported list and if we front
end it with either NFS or iSCSI it'll be supported, and VMware
suggested that) and some of it will be backed by J4400 SATA disk.

   I have seen some discussion of this here, but it has all been
related to very specific configurations and issues, I am looking for
general recommendations and experiences. Thanks.

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss