Re: [zfs-discuss] weird bug with Seagate 3TB USB3 drive
As a sanity check, I connected the drive to a Windows 7 installation. I was able to partition, create an NTFS volume on it, eject and remount it. I also tried creating the zpool on my Solaris 10 system, exporting and trying to import the pool on my Solaris 11X system and again no love. I'm baffled why zpool import is unable to find the pool on the drive, but the drive is definitely functional. John groenv...@acm.org ___ zfs-discuss mailing list zfs-discuss@opensolaris.org http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
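For anyone else chasing something similar, two quick checks from the CLI can tell you whether the labels ever made it to the disk (the device path below is a placeholder for wherever the USB disk enumerates on your box):

```shell
# Does the disk carry ZFS labels at all? zdb can dump the label
# copies straight off the device (placeholder device name):
zdb -l /dev/rdsk/c5t0d0s0

# zpool import scans /dev/dsk by default; pointing it at the device
# directory explicitly, then naming the pool, sometimes helps when
# a removable disk enumerates oddly:
zpool import -d /dev/dsk
zpool import -d /dev/dsk mypool
```

If zdb shows intact labels but import still finds nothing, the problem is more likely in how the device is being enumerated than in the pool itself.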
Re: [zfs-discuss] Thumper (X4500), and CF SSD for L2ARC = ?
Here's the zpool layout. You don't really have a choice on the boot volume -- the system supports only two drives on the same chain. The remaining drives are as shown:

  pool: internal
 state: ONLINE
  scan: scrub repaired 0 in 0h0m with 0 errors on Sat Oct  8 21:15:42 2011
config:

        NAME        STATE     READ WRITE CKSUM
        internal    ONLINE       0     0     0
          raidz1-0  ONLINE       0     0     0
            c3t0d0  ONLINE       0     0     0
            c4t0d0  ONLINE       0     0     0
            c6t0d0  ONLINE       0     0     0
            c8t0d0  ONLINE       0     0     0
            c9t0d0  ONLINE       0     0     0
          raidz1-1  ONLINE       0     0     0
            c3t1d0  ONLINE       0     0     0
            c4t1d0  ONLINE       0     0     0
            c6t1d0  ONLINE       0     0     0
            c8t1d0  ONLINE       0     0     0
            c9t1d0  ONLINE       0     0     0
          raidz1-2  ONLINE       0     0     0
            c3t2d0  ONLINE       0     0     0
            c4t2d0  ONLINE       0     0     0
            c6t2d0  ONLINE       0     0     0
            c8t2d0  ONLINE       0     0     0
            c9t2d0  ONLINE       0     0     0
          raidz1-3  ONLINE       0     0     0
            c3t3d0  ONLINE       0     0     0
            c4t3d0  ONLINE       0     0     0
            c6t3d0  ONLINE       0     0     0
            c8t3d0  ONLINE       0     0     0
            c9t3d0  ONLINE       0     0     0
          raidz1-4  ONLINE       0     0     0
            c3t4d0  ONLINE       0     0     0
            c4t4d0  ONLINE       0     0     0
            c6t4d0  ONLINE       0     0     0
            c8t4d0  ONLINE       0     0     0
            c9t4d0  ONLINE       0     0     0
          raidz1-5  ONLINE       0     0     0
            c3t5d0  ONLINE       0     0     0
            c4t5d0  ONLINE       0     0     0
            c6t5d0  ONLINE       0     0     0
            c8t5d0  ONLINE       0     0     0
            c9t5d0  ONLINE       0     0     0
          raidz1-6  ONLINE       0     0     0
            c3t6d0  ONLINE       0     0     0
            c4t6d0  ONLINE       0     0     0
            c6t6d0  ONLINE       0     0     0
            c8t6d0  ONLINE       0     0     0
            c9t6d0  ONLINE       0     0     0
          raidz1-7  ONLINE       0     0     0
            c3t7d0  ONLINE       0     0     0
            c4t7d0  ONLINE       0     0     0
            c6t7d0  ONLINE       0     0     0
            c8t7d0  ONLINE       0     0     0
            c9t7d0  ONLINE       0     0     0
        logs
          c7t2d0    ONLINE       0     0     0
          c7t6d0    ONLINE       0     0     0
        spares
          c7t1d0    AVAIL
          c7t5d0    AVAIL
          c7t3d0    AVAIL

  pool: rpool
 state: ONLINE
  scan: scrub repaired 0 in 0h6m with 0 errors on Sat Oct  8 21:21:54 2011
config:

        NAME          STATE     READ WRITE CKSUM
        rpool         ONLINE       0     0     0
          mirror-0    ONLINE       0     0     0
            c7t0d0s0  ONLINE       0     0     0
            c7t4d0s0  ONLINE       0     0     0

errors: No known data errors

It appears I misspoke on the caches. I thought I'd used two drives for cache, but apparently not. And Solaris 11 is supposed to be out Real Soon Now. :-)

On 10/14/11 02:54 PM, Jim Klimov wrote:
2011-10-14 23:57, Gregory Shaw wrote:
You might want to keep in mind that the X4500 was a ~2006 box, and had only PCI-X slots.
Or, at least, that's what the three I've got have. I think the X4540 had PCIe, but I never got one of those. :-( I haven't seen any cache accelerator PCI-X cards. However, what I've done on the X4500 systems in the lab is to use two drives on the system disk bus for the cache and log devices (each).

So you have 44 data drives, 2 OS drives and 2 zil/cache devices? And what do you use for zil/cache? SSDs? Specific ones?

With the 175 release of Solaris 11, I have literally seen a scrub running at 960 MB/s, and around 400 MB/s for NFS over 10GbE.

Hmm, and where can you get that release in the open? ;)
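For the archives: the two log devices shown in the status output (c7t2d0 and c7t6d0) would have been attached with a plain `zpool add` along these lines (a sketch of the usual sequence, not the exact command history):

```shell
# Add two separate (non-mirrored) log vdevs to the pool;
# use "log mirror c7t2d0 c7t6d0" instead to mirror them:
zpool add internal log c7t2d0 c7t6d0
```

Note that separate (striped) log vdevs mean losing either device can cost you the in-flight synchronous writes, which is why mirrored slogs are usually recommended.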
Re: [zfs-discuss] Thumper (X4500), and CF SSD for L2ARC = ?
You might want to keep in mind that the X4500 was a ~2006 box, and had only PCI-X slots. Or, at least, that's what the three I've got have. I think the X4540 had PCIe, but I never got one of those. :-( I haven't seen any cache accelerator PCI-X cards. However, what I've done on the X4500 systems in the lab is to use two drives on the system disk bus for the cache and log devices (each).

With the 175 release of Solaris 11, I have literally seen a scrub running at 960 MB/s, and around 400 MB/s for NFS over 10GbE.

On Oct 14, 2011, at 12:25 PM, Erik Trimble wrote:

> On 10/14/2011 5:49 AM, Darren J Moffat wrote:
>> On 10/14/11 13:39, Jim Klimov wrote:
>>> Hello, I was asked if the CF port in Thumpers can be accessed by the OS? In particular, would it be a good idea to use a modern 600x CF card (some reliable one intended for professional photography) as an L2ARC device using this port?
>>
>> I don't know about the Thumpers internal CF slot.
>>
>> I can say I have tried using a fast (at the time, this was about 3 years ago) CF card via a CF to IDE adaptor before and it turned out to be a really bad idea because the spinning rust disk (which was SATA) was actually faster to access. Same went for USB to CF adaptors at the time too.
>
> Last I'd checked, the CF port was fully functional.
>
> However, I'd not use it as L2ARC (and, certainly not ZIL). CF is not good in terms of either random write or read - professional-grade CF cards are optimized for STREAMING write - that is, the ability to write a big-ass JPG or BMP or TIFF as quickly as possible. The CF controller isn't good on lots of little read/write ops.
>
> In Casper's case, the CF->IDE adapter makes this even worse, since IDE is spectacularly bad at IOPS.
>
> I can't remember - does the X4500 have any extra SATA ports free on the motherboard? And, does it have any extra HD power connectors?
> http://www.amazon.com/dp/B002MWDRD6/ref=asc_df_B002MWDRD61280186?smid=A2YLYLTN75J8LR&tag=shopzilla_mp_1382-20&linkCode=asn&creative=395105&creativeASIN=B002MWDRD6
>
> Is a great way to add a 2.5" drive slot, but it's just a physical slot adapter - you need to attach a standard SATA cable and HD power connector to it.
>
> If that's not an option, find yourself a cheap PCI-E adapter with eSATA ports on it, then use an external HD enclosure with eSATA for a small SSD.
>
> As a last resort, remove one of the 3.5" SATA drives, and put in an SSD in a 2.5"->3.5" converter enclosure.
>
> Remember, you can generally get by fine with a lower-end SSD as L2ARC, so a 60GB SSD should be $100 or less.
>
> -Erik

- Gregory Shaw, Enterprise IT Architect
Oracle Global IT Service Design Group
500 Eldorado Blvd, UBRM02-157, Broomfield, CO 80021
Phone: (303) 246-5411
greg.s...@oracle.com (work) / gr...@fmsoft.com (home)

Hoping the problem magically goes away by ignoring it is the "microsoft approach to programming" and should never be allowed. (Linus Torvalds)
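Whichever slot the SSD ends up in, attaching it as L2ARC afterwards is a one-liner (pool name from the layout in this thread; the device name is a placeholder for wherever the SSD shows up):

```shell
# Cache (L2ARC) devices can be added and removed at any time
# without risk to pool data:
zpool add internal cache c5t0d0
# and later, if the device misbehaves:
zpool remove internal c5t0d0
```

Unlike log devices, a failed cache device only costs you cached reads, which is why a lower-end SSD is acceptable here.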
Re: [zfs-discuss] Thumper (X4500), and CF SSD for L2ARC = ?
Jim Klimov wrote:
Thanks, but I believe currently that's out of budget, but a 90MB/s CF module may be acceptable for the small business customer. I wondered if that is known to work or not...

I've had a compact flash IDE drive not work in a white-box system. In that case it was a ufs root disk, but any attempt to put a serious load on it, and it corrupted data all over the place. So if you're going to try one, make sure you hammer it very hard in a test environment before you commit anything important to it.

--
Andrew Gabriel
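Seconding the "hammer it hard" advice: even a crude write/read-back loop will flush out the kind of silent corruption described above. A minimal sketch (pass counts, sizes and paths are made up; by default it targets a disposable scratch file under /tmp -- only point TARGET at the raw CF device node if its contents are expendable):

```shell
#!/bin/sh
# Crude burn-in: write random data, read it back, compare checksums.
# TARGET defaults to a scratch file so the sketch is safe to run as-is.
TARGET=${TARGET:-/tmp/cf-test.img}
PASSES=${PASSES:-2}
BLOCKS=${BLOCKS:-64}            # 64 x 1 MiB = 64 MiB per pass

i=1
while [ "$i" -le "$PASSES" ]; do
    # Generate a fresh random pattern and remember its checksum
    dd if=/dev/urandom of=/tmp/cf-pattern.$$ bs=1048576 count="$BLOCKS" 2>/dev/null
    want=$(cksum < /tmp/cf-pattern.$$)
    # Write it out, force it to media, read it back and compare
    dd if=/tmp/cf-pattern.$$ of="$TARGET" bs=1048576 conv=notrunc 2>/dev/null
    sync
    got=$(dd if="$TARGET" bs=1048576 count="$BLOCKS" 2>/dev/null | cksum)
    if [ "$want" = "$got" ]; then
        echo "pass $i: OK"
    else
        echo "pass $i: CHECKSUM MISMATCH" >&2
        exit 1
    fi
    i=$((i + 1))
done
rm -f /tmp/cf-pattern.$$
```

Raise PASSES and BLOCKS, and vary bs downwards, to approximate the small-random-I/O pattern an L2ARC would actually see.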
Re: [zfs-discuss] Thumper (X4500), and CF SSD for L2ARC = ?
2011-10-14 21:01, Jordan Schwartz wrote:
Try and get your hands on a Sun F20 Card which has 4 x 25GB SSD Modules on a PCI card.

Thanks, but I believe currently that's out of budget, but a 90MB/s CF module may be acceptable for the small business customer. I wondered if that is known to work or not...

Also if you can get an X4540 system controller you can then use 64GB of RAM aka L1ARC.

I am not sure I can get that (not until they become refurbished and cheap), but still: are the X4540 mainboards compatible with the X4500 chassis? Is it possible to upgrade one into the other? ;)

Jordan

On Fri, Oct 14, 2011 at 5:39 AM, Jim Klimov wrote:
Hello, I was asked if the CF port in Thumpers can be accessed by the OS? In particular, would it be a good idea to use a modern 600x CF card (some reliable one intended for professional photography) as an L2ARC device using this port? Thanks, //Jim
Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea
Also, it's not worth doing a clustered ZFS thing that is too application-specific. You really want to nail down your choices of semantics, explore what design options those yield (or approach from the other direction, or both), and so on.

Nico --
Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea
On Thu, Oct 13, 2011 at 9:13 PM, Jim Klimov wrote:
> Thanks to Nico for concerns about POSIX locking. However, hopefully, in the usecase I described - serving images of VMs in a manner where storage, access and migration are efficient - whole datasets (be it volumes or FS datasets) can be dedicated to one VM host server at a time, just like whole pools are dedicated to one host nowadays. In this case POSIX compliance can be disregarded - access is locked by one host, not available to others, period. Of course, there is a problem of capturing storage from hosts which died, and avoiding corruptions - but this is hopefully solved in the past decades of clustering tech's.

It sounds to me like you need horizontal scaling more than anything else. In that case, why not use pNFS or Lustre? Even if you want snapshots, a VM should be able to handle that on its own, and though probably not as nicely as ZFS in some respects, having the application be in control of the exact snapshot boundaries does mean that you don't have to quiesce your VMs just to snapshot safely.

> Nico also confirmed that "one node has to be a master of all TXGs" - which is conveyed in both ideas of my original post.

Well, at any one time one node would have to be the master of the next TXG, but it doesn't mean that you couldn't have some cooperation. There are lots of other much more interesting questions. I think the biggest problem lies in requiring full connectivity from every server to every LUN. I'd much rather take the Lustre / pNFS model (which, incidentally, don't preclude having snapshots).

Nico --
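For what it's worth, if each VM does get its own dataset as proposed, the ZFS side of snapshotting stays trivial regardless of who coordinates quiescing (dataset names below are hypothetical):

```shell
# Snapshot a per-VM dataset, clone it for a scratch copy,
# and roll back if an upgrade inside the guest goes wrong:
zfs snapshot tank/vm/guest01@pre-upgrade
zfs clone tank/vm/guest01@pre-upgrade tank/vm/guest01-scratch
zfs rollback tank/vm/guest01@pre-upgrade
```

The clone shares blocks with the snapshot, so a scratch copy of a large VM image is effectively free until it diverges.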
Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea
2011-10-14 19:33, Tim Cook wrote:
> With clustered VMFS on shared storage, VMWare can migrate VMs faster - it knows not to copy the HDD image file in vain - it will be equally available to the "new host" at the correct point in migration, just as it was accessible to the "old host". Again. NFS/iscsi/IB = ok. True, except that this is not an optimal solution in this described usecase - a farm of server blades with a relatively dumb fast raw storage (but NOT an intellectual storage server). //Jim
>
> The idea is you would dedicate one of the servers in the chassis to be a Solaris system, which then presents NFS out to the rest of the hosts. From the chassis itself you would present every drive that isn't being used to boot an existing server to this solaris host as individual disks, and let that server take care of RAID and presenting out the storage to the rest of the vmware hosts.
>
> --Tim

Yes, I wrote of that as an option - but a relatively poor one (though now we're limited to do this). As I wrote several times, major downsides are:

* probably increased latency due to another added hop of processing delays, just as with extra switches and routers in networks;

* probably reduced bandwidth of LAN as compared to direct disk access; certainly it won't get increased ;) Besides, the LAN may be (highly) utilized by servers running in VMs or physical blades, so storage traffic over LAN would compete with real networking and/or add to latencies.

* in order for the whole chassis to provide HA services and run highly-available VMs, the storage servers have to be redundant - at least one other blade would have to be provisioned for failover ZFS import and serving for other nodes. This is not exactly a showstopper - but the "spare" blade would either have to not run VMs at all, or run not as many VMs as others, and in case of a pool failover event it would probably have to migrate its running VMs away in order to increase ARC and reduce storage latency for other servers.
That's doable, and automatable, but a hassle nonetheless. Also I'm not certain how well other hosts can benefit from caching in their local RAMs when using NFS or iSCSI resources. I think they might benefit better from local ARCs if the pool were directly imported to each of them...

Upsides are:

* this already works, and reliably, as any other ZFS NAS solution. That's a certain "plus" :)

In this current case one or two out of six blades should be dedicated to storage, leaving only 4 or 5 to VMs. In case of shared pools, there is a new problem of TXG-master failover to some other node (which would probably be not slower than a pool reimport is now), but otherwise all six servers' loads are balanced. And they only cache what they really need. And they have faster disk access times. And they don't use the LAN superfluously for storage access.

//Jim

PS: Anyway, I wanted to say this earlier - thanks to everyone who responded, even (or especially) with criticism and requests for more detail. If nothing else, you helped me describe my idea better and less ambiguously, so that some other thinkers can decide whether and how to implement it ;)

PPS: When I earlier asked about getting ZFS under the hood of RAID controllers, I guess I kinda wished to replace the black box of Intel's firmware with a ZFS-aware OS (FreeBSD probably) - the storage controller modules must be some sort of computers running in a failover link... These SCMs would then export datasets as SAS LUNs to specific servers, like is done now, and possibly would not require clustered ZFS - but might benefit from it too. So my MFSYS illustration is partially relevant for that question as well...
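For completeness, the "dedicate a blade to storage" variant is only a couple of commands on the Solaris side once the pool exists (pool/dataset names and the subnet below are made up for illustration):

```shell
# Carve out a dataset for VM images and export it over NFS,
# restricting access to the blade subnet:
zfs create tank/vmimages
zfs set sharenfs=rw=@192.168.10.0/24,root=@192.168.10.0/24 tank/vmimages
```

The sharenfs property persists with the dataset, so a failover import on the spare blade re-exports the share automatically.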
Re: [zfs-discuss] Wanted: sanity check for a clustered ZFS idea
On Fri, Oct 14, 2011 at 7:36 AM, Jim Klimov wrote:
> 2011-10-14 15:53, Edward Ned Harvey wrote:
>> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-boun...@opensolaris.org] On Behalf Of Jim Klimov
>>> I guess Richard was correct about the usecase description - I should detail what I'm thinking about, to give some illustration.
>>
>> After reading all this, I'm still unclear on what you want to accomplish, that isn't already done today. Yes I understand what it means when we say ZFS is not a clustering filesystem, and yes I understand what benefits there would be to gain if it were a clustering FS. But in all of what you're saying below, I don't see that you need a clustering FS.
>
> In my example - probably not a completely clustered FS. A clustered ZFS pool with datasets individually owned by specific nodes at any given time would suffice for such VM farms. This would give users the benefits of ZFS (resilience, snapshots and clones, shared free space) merged with the speed of direct disk access instead of lagging through a storage server accessing these disks.
>
> This is why I think such a solution may be more simple than a fully-fledged POSIX-compliant shared FS, but it would still have some benefits for specific - and popular - usage cases. And it might pave way for a more complete solution - or perhaps illustrate what should not be done for those solutions ;)
>
> After all, I think that if the problem of safe multiple-node RW access to ZFS gets fundamentally solved, these usages I described before might just become a couple of new dataset types with specific predefined usage and limitations - like POSIX-compliant FS datasets and block-based volumes are now defined over ZFS. There is no reason not to call them "clustered FS and clustered volume datasets", for example ;)
>
> AFAIK, VMFS is not a generic filesystem, and cannot quite be used "directly" by software applications, but it has its target market for shared VM farming...
>
> I do not know how they solve the problems of consistency control - with master nodes or something else, and for the sake of patent un-encroaching, I'm afraid I'd rather not know - as to not copycat someone's solution and get burnt for that ;)
>
>>> of these deployments become VMWare ESX farms with shared VMFS. Due to my stronger love for things Solaris, I would love to see ZFS and any of Solaris-based hypervisors (VBox, Xen or KVM ports) running there instead. But for things to be as efficient, ZFS would have to become shared - clustered...
>>
>> I think the solution people currently use in this area is either NFS or iscsi. (Or infiniband, and other flavors.) You have a storage server presenting the storage to the various vmware (or whatever) hypervisors.
>
> In fact, no. Based on the MFSYS model, there is no storage server. There is a built-in storage controller which can do RAID over HDDs and represent SCSI LUNs to the blades over direct SAS access. These LUNs can be accessed individually by certain servers, or concurrently. In the latter case it is possible that servers take turns mounting the LUN as a HDD with some single-server FS, or use a clustered FS to use the LUN's disk space simultaneously.
>
> If we were to use in this system an OpenSolaris-based OS and VirtualBox/Xen/KVM as they are now, and hope for live migration of VMs without copying of data, we would have to make a separate LUN for each VM on the controller, and mount/import this LUN to its current running host. I don't need to explain why that would be a clumsy and inflexible solution for a near-infinite number of reasons, do I? ;)
>
>> Everything works. What's missing? And why does this need to be a clustering FS?
>>
>>> To be clearer, I should say that modern VM hypervisors can migrate running virtual machines between two VM hosts.
>>
>> This works on NFS/iscsi/IB as well. Doesn't need a clustering FS.
>
> Except that the storage controller doesn't do NFS/iscsi/IB, and doesn't do snapshots and clones. And if I were to dedicate one or two out of six blades to storage tasks, this might be considered an improper waste of resources. And would repackage SAS access (anyway available to all blades at full bandwidth) into NFS/iscsi access over a Gbit link...
>
>>> With clustered VMFS on shared storage, VMWare can migrate VMs faster - it knows not to copy the HDD image file in vain - it will be equally available to the "new host" at the correct point in migration, just as it was accessible to the "old host".
>>
>> Again. NFS/iscsi/IB = ok.
>
> True, except that this is not an optimal solution in this described usecase - a farm of server blades with a relatively dumb fast raw storage (but NOT an intellectual storage server).
>
> //Jim

The idea is you would dedicate one of the servers in the chassis to be a Solaris system, which then presents NFS out to the rest of the hosts. From the chassis itself you would present every drive that isn't being used to boot an existing server to this solaris host as individual disks, and let that server take care of RAID and presenting out the storage to the rest of the vmware hosts.

--Tim
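As an aside, the "one LUN per VM" arrangement discussed above can at least be scripted today with zvols and COMSTAR on a Solaris storage head, rather than carving LUNs in the controller firmware (all names are hypothetical; a sketch of the usual sequence):

```shell
# On the storage head: one zvol per guest, exported as an iSCSI LU.
svcadm enable stmf svc:/network/iscsi/target:default
zfs create -V 20g tank/vm/guest01
stmfadm create-lu /dev/zvol/rdsk/tank/vm/guest01
stmfadm add-view <lu-guid>    # GUID is printed by create-lu
itadm create-target           # one target the VM hosts log in to
```

That keeps snapshots and clones on the ZFS side, though it still funnels all I/O through the storage head rather than giving blades direct SAS access.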