Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, Apr 25, 2012 at 8:57 PM, Paul Kraus wrote:
> On Wed, Apr 25, 2012 at 9:07 PM, Nico Williams wrote:
>> Nothing's changed. Automounter + data migration -> rebooting clients
>> (or close enough to rebooting). I.e., outage.
>
> Uhhh, not if you design your automounter architecture correctly and (as
> Richard said) have NFS clients that are not lame, to which I'll add,
> automounters that actually work as advertised. I was designing automount
> architectures that permitted dynamic changes with minimal to no outages
> in the late 1990s. I only had a little over 100 clients (most of which
> were also servers) and NIS+ (NIS ver. 3) to distribute the indirect
> automount maps.

Further below you admit that you're talking about read-only data, effectively. But the world is not static. Sure, *code* is by and large static, and indeed, we segregated data by whether it was read-only (code, historical data) or not (application data, home directories). We were able to migrate *read-only* data with no outages. But for the rest? Yeah, there were always outages. Of course, we had a periodic maintenance window, with all systems rebooting within a short period, and this meant that some data migration outages were not noticeable, but they were real.

> I also had to _redesign_ a number of automount strategies that were
> built by people who thought that using direct maps for everything was a
> good idea. That _was_ a pain in the a** due to the changes needed at the
> applications to point at a different hierarchy.

We used indirect maps almost exclusively. Moreover, we used hierarchical automount entries, and even -autofs mounts. We also used environment variables to control various things, such as which servers to mount what from (this was particularly useful for spreading the load on read-only static data). We used practically every feature of the automounter except for executable maps (and direct maps, once we eventually stopped using those).
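[Editorial note: the indirect-map-plus-variables setup described above can be sketched as a pair of automounter map files. All map, server, and directory names here are hypothetical, not taken from the thread.]

```
# auto_master: attach the indirect map "auto_data" under /data
/data    auto_data

# auto_data: $DATAHOST is set per client (e.g. via the automounter's
# environment) so different clients mount read-only data from
# different replica servers, spreading the load
tools    -ro    $DATAHOST:/export/tools
hist     -ro    $DATAHOST:/export/hist
home     -rw    homesrv:/export/home/&
```

Because the map is indirect, repointing a key (say, moving "tools" to another server) only requires a map update; clients pick it up on the next mount without any path changes in applications.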
> It all depends on _what_ the application is doing. Something that opens
> and locks a file and never releases the lock or closes the file until
> the application exits will require a restart of the application with an
> automounter / NFS approach.

No kidding! In the real world such applications exist and get used.

Nico

___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, Apr 25, 2012 at 9:07 PM, Nico Williams wrote:
> On Wed, Apr 25, 2012 at 7:37 PM, Richard Elling wrote:
>> On Apr 25, 2012, at 3:36 PM, Nico Williams wrote:
>>> I disagree vehemently. automount is a disaster because you need to
>>> synchronize changes with all those clients. That's not realistic.
>>
>> Really? I did it with NIS automount maps and 600+ clients back in 1991.
>> Other than the obvious problems with open files, has it gotten worse
>> since then?
>
> Nothing's changed. Automounter + data migration -> rebooting clients
> (or close enough to rebooting). I.e., outage.

Uhhh, not if you design your automounter architecture correctly and (as Richard said) have NFS clients that are not lame, to which I'll add, automounters that actually work as advertised. I was designing automount architectures that permitted dynamic changes with minimal to no outages in the late 1990s. I only had a little over 100 clients (most of which were also servers) and NIS+ (NIS ver. 3) to distribute the indirect automount maps.

I also had to _redesign_ a number of automount strategies that were built by people who thought that using direct maps for everything was a good idea. That _was_ a pain in the a** due to the changes needed at the applications to point at a different hierarchy.

It all depends on _what_ the application is doing. Something that opens and locks a file and never releases the lock or closes the file until the application exits will require a restart of the application with an automounter / NFS approach.

--
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company ( http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, Apr 25, 2012 at 7:37 PM, Richard Elling wrote:
> On Apr 25, 2012, at 3:36 PM, Nico Williams wrote:
>> I disagree vehemently. automount is a disaster because you need to
>> synchronize changes with all those clients. That's not realistic.
>
> Really? I did it with NIS automount maps and 600+ clients back in 1991.
> Other than the obvious problems with open files, has it gotten worse
> since then?

Nothing's changed. Automounter + data migration -> rebooting clients (or close enough to rebooting). I.e., outage.

> Storage migration is much more difficult with NFSv2, NFSv3, NetWare, etc.

But not with AFS. And spec-wise not with NFSv4 (though I don't know if/when all NFSv4 clients will properly support migration, just that the protocol and some servers do).

> With server-side, referral-based namespace construction that problem
> goes away, and the whole thing can be transparent w.r.t. migrations.

Yes.

> Agree, but we didn't have NFSv4 back in 1991 :-) Today, of course, this
> is how one would design it if you had to design a new DFS today.

Indeed, that's why I built an automounter solution in 1996 (that's still in use, I'm told). Although to be fair, AFS existed back then, had a global namespace and data migration, and was mature. It's taken NFS that long to catch up...

> [...]
> Almost any of the popular nosql databases offer this and more. The
> movement away from POSIX-ish DFS and storing data in traditional
> "files" is inevitable. Even ZFS is an object store at its core.

I agree. Except that there are applications where large octet streams are needed. HPC and media come to mind.

Nico
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Apr 25, 2012, at 3:36 PM, Nico Williams wrote:
> On Wed, Apr 25, 2012 at 5:22 PM, Richard Elling wrote:
>> Unified namespace doesn't relieve you of 240 cross-mounts (or
>> equivalents). FWIW, automounters were invented 20+ years ago to handle
>> this in a nearly seamless manner. Today, we have DFS from Microsoft and
>> NFS referrals that almost eliminate the need for automounter-like
>> solutions.
>
> I disagree vehemently. automount is a disaster because you need to
> synchronize changes with all those clients. That's not realistic.

Really? I did it with NIS automount maps and 600+ clients back in 1991. Other than the obvious problems with open files, has it gotten worse since then?

> I've built a large automount-based namespace, replete with a distributed
> configuration system for setting the environment variables available to
> the automounter. I can tell you this: the automounter does not scale,
> and it certainly does not avoid the need for outages when storage
> migrates.

Storage migration is much more difficult with NFSv2, NFSv3, NetWare, etc.

> With server-side, referral-based namespace construction that problem
> goes away, and the whole thing can be transparent w.r.t. migrations.

Agree, but we didn't have NFSv4 back in 1991 :-) Today, of course, this is how one would design it if you had to design a new DFS today.

> For my money the key features a DFS must have are:
>
> - server-driven namespace construction
> - data migration without having to restart clients, reconfigure them,
>   or do anything at all to them
> - aggressive caching
> - striping of file data for HPC and media environments
> - semantics that ultimately allow multiple processes on disparate
>   clients to cooperate (i.e., byte range locking), but I don't think
>   full POSIX semantics are needed

Almost any of the popular nosql databases offer this and more. The movement away from POSIX-ish DFS and storing data in traditional "files" is inevitable. Even ZFS is an object store at its core.

> (that said, I think O_EXCL is necessary, and it'd be very nice to have
> O_APPEND, though the latter is particularly difficult to implement and
> painful when there's contention if you stripe file data across multiple
> servers)

+1
-- richard

--
ZFS Performance and Training
richard.ell...@richardelling.com
+1-760-896-4422
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, Apr 25, 2012 at 5:22 PM, Richard Elling wrote:
> Unified namespace doesn't relieve you of 240 cross-mounts (or
> equivalents). FWIW, automounters were invented 20+ years ago to handle
> this in a nearly seamless manner. Today, we have DFS from Microsoft and
> NFS referrals that almost eliminate the need for automounter-like
> solutions.

I disagree vehemently. automount is a disaster because you need to synchronize changes with all those clients. That's not realistic.

I've built a large automount-based namespace, replete with a distributed configuration system for setting the environment variables available to the automounter. I can tell you this: the automounter does not scale, and it certainly does not avoid the need for outages when storage migrates.

With server-side, referral-based namespace construction that problem goes away, and the whole thing can be transparent w.r.t. migrations.

For my money the key features a DFS must have are:

- server-driven namespace construction
- data migration without having to restart clients, reconfigure them, or
  do anything at all to them
- aggressive caching
- striping of file data for HPC and media environments
- semantics that ultimately allow multiple processes on disparate clients
  to cooperate (i.e., byte range locking), but I don't think full POSIX
  semantics are needed

(that said, I think O_EXCL is necessary, and it'd be very nice to have O_APPEND, though the latter is particularly difficult to implement and painful when there's contention if you stripe file data across multiple servers)

Nico
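[Editorial note: the O_EXCL semantics called necessary above are the atomic exclusive-create primitive cooperating clients use for mutual exclusion over a shared filesystem. A minimal sketch in Python, using a local temp directory to stand in for a DFS mount; on a filesystem that honors exclusive create, only one of several racing clients wins.]

```python
import errno
import os
import tempfile

def try_claim(path):
    """Atomically create a claim file; return True if we won the race.

    O_CREAT|O_EXCL fails with EEXIST when the file already exists,
    so exactly one caller across all clients succeeds.
    """
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY, 0o644)
        os.close(fd)
        return True
    except OSError as e:
        if e.errno == errno.EEXIST:
            return False
        raise

# Two competing attempts on the same path: the first wins, the second loses.
claim = os.path.join(tempfile.mkdtemp(), "job.lock")
print(try_claim(claim))  # True
print(try_claim(claim))  # False
```

The thread's point is that a DFS must make this create atomic across clients, not just across processes on one host; an NFS server that serializes creates gives you exactly this guarantee.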
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
2:34pm, Rich Teer wrote:
> On Wed, 25 Apr 2012, Paul Archer wrote:
>> Simple. With a distributed FS, all nodes mount from a single DFS. With
>> NFS, each node would have to mount from each other node. With 16 nodes,
>> that's what, 240 mounts? Not to mention your data is in 16 different
>> mounts/directory structures, instead of being in a unified filespace.
>
> Perhaps I'm being overly simplistic, but in this scenario, what would
> prevent one from having, on a single file server,
> /exports/nodes/node[0-15], and then having each node NFS-mount
> /exports/nodes from the server? Much simpler than your example, and all
> data is available on all machines/nodes.

That assumes the data set will fit on one machine, and that that machine won't be a performance bottleneck.
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Apr 25, 2012, at 2:26 PM, Paul Archer wrote:
> 2:20pm, Richard Elling wrote:
>> On Apr 25, 2012, at 12:04 PM, Paul Archer wrote:
>>
>>     Interesting, something more complex than NFS to avoid the
>>     complexities of NFS? ;-)
>>
>>     We have data coming in on multiple nodes (with local storage) that
>>     is needed on other multiple nodes. The only way to do that with NFS
>>     would be with a matrix of cross mounts that would be truly scary.
>>
>> Ignoring lame NFS clients, how is that architecture different than what
>> you would have with any other distributed file system? If all nodes
>> share data to all other nodes, then...?
>> -- richard
>
> Simple. With a distributed FS, all nodes mount from a single DFS. With
> NFS, each node would have to mount from each other node. With 16 nodes,
> that's what, 240 mounts? Not to mention your data is in 16 different
> mounts/directory structures, instead of being in a unified filespace.

Unified namespace doesn't relieve you of 240 cross-mounts (or equivalents). FWIW, automounters were invented 20+ years ago to handle this in a nearly seamless manner. Today, we have DFS from Microsoft and NFS referrals that almost eliminate the need for automounter-like solutions. Also, it is not unusual for an NFS environment to have 10,000+ mounts, with thousands of mounts on each server. No big deal, happens every day.

On Apr 25, 2012, at 2:53 PM, Nico Williams wrote:
> To be fair NFSv4 now has a distributed namespace scheme so you could
> still have a single mount on the client. That said, some DFSes have
> better properties, such as striping of data across sets of servers,
> aggressive caching, and various choices of semantics (e.g., Lustre tries
> hard to give you POSIX cache coherency semantics).

I think this is where the real value is. NFS & CIFS are intentionally generic and have caching policies that are favorably described as generic. For special-purpose workloads there can be advantages to having policies more explicitly applicable to the workload.

-- richard
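[Editorial note: the NFS referrals mentioned above can be illustrated with a Linux exports(5) sketch. Hostnames and paths here are hypothetical; the mechanism is real: the server answers a lookup under the referral point with a location hint, and an NFSv4 client follows it transparently, so the namespace is assembled server-side with a single client mount.]

```
# /etc/exports on the namespace server: clients walking into
# /export/data are referred to the /data export on server2
/export         *(ro,fsid=0)
/export/data    *(ro,refer=/data@server2)
```

Migrating the data then means updating the refer= target on the namespace server, not reconfiguring every client.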
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, 25 Apr 2012, Rich Teer wrote:
> Perhaps I'm being overly simplistic, but in this scenario, what would
> prevent one from having, on a single file server,
> /exports/nodes/node[0-15], and then having each node NFS-mount
> /exports/nodes from the server? Much simpler than your example, and all
> data is available on all machines/nodes.

This solution would limit bandwidth to that available from that single server. With the cluster approach, the objective is for each machine in the cluster to primarily access files which are stored locally. Whole files could be moved as necessary.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer, http://www.GraphicsMagick.org/
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, Apr 25, 2012 at 4:26 PM, Paul Archer wrote:
> 2:20pm, Richard Elling wrote:
>> Ignoring lame NFS clients, how is that architecture different than what
>> you would have with any other distributed file system? If all nodes
>> share data to all other nodes, then...?
>
> Simple. With a distributed FS, all nodes mount from a single DFS. With
> NFS, each node would have to mount from each other node. With 16 nodes,
> that's what, 240 mounts? Not to mention your data is in 16 different
> mounts/directory structures, instead of being in a unified filespace.

To be fair, NFSv4 now has a distributed namespace scheme, so you could still have a single mount on the client. That said, some DFSes have better properties, such as striping of data across sets of servers, aggressive caching, and various choices of semantics (e.g., Lustre tries hard to give you POSIX cache coherency semantics).

Nico
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Wed, 25 Apr 2012, Paul Archer wrote:
> Simple. With a distributed FS, all nodes mount from a single DFS. With
> NFS, each node would have to mount from each other node. With 16 nodes,
> that's what, 240 mounts? Not to mention your data is in 16 different
> mounts/directory structures, instead of being in a unified filespace.

Perhaps I'm being overly simplistic, but in this scenario, what would prevent one from having, on a single file server, /exports/nodes/node[0-15], and then having each node NFS-mount /exports/nodes from the server? Much simpler than your example, and all data is available on all machines/nodes.

--
Rich Teer, Publisher
Vinylphile Magazine
www.vinylphilemag.com
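[Editorial note: Rich's single-server layout can be sketched as follows. The server name, mount point, and export options are hypothetical; only the shape of the setup comes from the thread.]

```
# On the file server: one export covering all per-node directories
# (/etc/exports)
/exports/nodes    *(rw,sync,no_subtree_check)

# On each of the 16 nodes: a single mount exposes every node's directory
mount -t nfs fileserver:/exports/nodes /nodes
```

This trades the cross-mount mesh for one mount per client, at the cost of funneling all I/O through one server, which is exactly the bottleneck objection raised in the replies.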
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
2:20pm, Richard Elling wrote:
> On Apr 25, 2012, at 12:04 PM, Paul Archer wrote:
>
>     Interesting, something more complex than NFS to avoid the
>     complexities of NFS? ;-)
>
>     We have data coming in on multiple nodes (with local storage) that
>     is needed on other multiple nodes. The only way to do that with NFS
>     would be with a matrix of cross mounts that would be truly scary.
>
> Ignoring lame NFS clients, how is that architecture different than what
> you would have with any other distributed file system? If all nodes
> share data to all other nodes, then...?
> -- richard

Simple. With a distributed FS, all nodes mount from a single DFS. With NFS, each node would have to mount from each other node. With 16 nodes, that's what, 240 mounts? Not to mention your data is in 16 different mounts/directory structures, instead of being in a unified filespace.
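[Editorial note: the "240 mounts" figure is the full-mesh cross-mount count: each of the 16 nodes mounts the exports of the other 15. A one-liner makes the arithmetic explicit.]

```python
def cross_mounts(n):
    """Number of NFS mounts in a full mesh: each of n nodes
    mounts from the other n - 1 nodes."""
    return n * (n - 1)

print(cross_mounts(16))  # 240
```

The count grows quadratically, which is why the mesh gets "truly scary" well before the node count does.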
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
On Apr 25, 2012, at 12:04 PM, Paul Archer wrote:
> 11:26am, Richard Elling wrote:
>
>> On Apr 25, 2012, at 10:59 AM, Paul Archer wrote:
>>
>>     The point of a clustered filesystem was to be able to spread our
>>     data out among all nodes and still have access from any node
>>     without having to run NFS. Size of the data set (once you get past
>>     the point where you can replicate it on each node) is irrelevant.
>>
>> Interesting, something more complex than NFS to avoid the complexities
>> of NFS? ;-)
>
> We have data coming in on multiple nodes (with local storage) that is
> needed on other multiple nodes. The only way to do that with NFS would
> be with a matrix of cross mounts that would be truly scary.

Ignoring lame NFS clients, how is that architecture different than what you would have with any other distributed file system? If all nodes share data to all other nodes, then...?

-- richard
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
And he will still need an underlying filesystem like ZFS for them :)

> -----Original Message-----
> From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
> boun...@opensolaris.org] On Behalf Of Nico Williams
> Sent: 25 April 2012 20:32
> To: Paul Archer
> Cc: ZFS-Discuss mailing list
> Subject: Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
>
> I agree, you need something like AFS, Lustre, or pNFS. And/or an NFS
> proxy to those.
>
> Nico
Re: [zfs-discuss] cluster vs nfs (was: Re: ZFS on Linux vs FreeBSD)
I agree, you need something like AFS, Lustre, or pNFS. And/or an NFS proxy to those.

Nico