Re: [zfs-discuss] One dataset per user?

2010-07-28 Thread Juergen Nickelsen
Edward Ned Harvey <solar...@nedharvey.com> writes:

 There are legitimate specific reasons to use separate filesystems
 in some circumstances. But if you can't name one reason why it's
 better ... then it's not better for you.

Having a separate filesystem per user lets you set user-specific
quotas and reservations, lets you allow users to take their own
snapshots, and lets you replicate a single user's home directory with
zfs send/recv (for backup, or to move it to another pool), or even
delegate that to the users themselves.
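
A sketch of those knobs (pool, user, and host names are hypothetical):

  zfs set quota=20G tank/home/alice          # per-user quota
  zfs set reservation=5G tank/home/alice     # guaranteed space
  # replicate one user's home directory to another pool or host
  zfs send tank/home/alice@weekly | ssh backuphost zfs recv -d backup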

-- 
Usenet is not a right. It is a right, a left, a jab, and a sharp
uppercut to the jaw. The postman hits! You have new mail.


Re: [zfs-discuss] One dataset per user?

2010-06-24 Thread Paul B. Henson
On Tue, 22 Jun 2010, Arne Jansen wrote:

 We found that the zfs utility is very inefficient as it does a lot of
 unnecessary and costly checks.

Hmm, presumably somebody at Sun doesn't agree with that assessment, or
you'd think they'd take them out :).

Mounting/sharing by hand outside of the zfs framework does make a huge
difference. With the mountpoint and sharenfs zfs properties set, it takes
about 45 minutes to mount/share or unshare/unmount; mounting/sharing by
hand with SHARE_NOINUSE_CHECK=1, even just sequentially, only took about 2
minutes. With some parallelization I could definitely see hitting that 10
seconds you mentioned, which would sure make my patch windows a hell of a
lot shorter. I'll need to put together a script and fiddle some with smf,
joy oh joy, as I need these filesystems mounted before the web server starts.
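
A rough sketch of what such a script might look like (untested; the pool
prefix, paths, and crude job throttle are illustrative, and note Arne's
warning elsewhere in the thread: with SHARE_NOINUSE_CHECK set, re-sharing
an already-shared filesystem can crash the machine):

  #!/usr/bin/ksh
  # Rough sketch, untested: mount legacy-mountpoint datasets and share
  # them with up to MAXJOBS workers in flight. The pool prefix and
  # target paths are made up; a real script must also make sure parents
  # are mounted before their children.
  # WARNING: with SHARE_NOINUSE_CHECK set, share(1M) skips its
  # already-shared check, so re-sharing a shared fs can crash the box.
  export SHARE_NOINUSE_CHECK=1
  MAXJOBS=50

  zfs list -H -o name -r tank/home | while read ds; do
          mp="/export/${ds#tank/}"
          ( mount -F zfs "$ds" "$mp" && share -F nfs -o rw "$mp" ) &
          # crude throttle on the number of background workers
          while [ "$(jobs | wc -l)" -ge "$MAXJOBS" ]; do
                  sleep 1
          done
  done
  wait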

Thanks much for the tip!

I'm hoping someday they'll clean up the sharing implementation and make it
a bit more scalable. I had a ticket open once and they pretty much said it
would never happen for Solaris 10, but maybe sometime in the indefinite
future for OpenSolaris...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] One dataset per user?

2010-06-22 Thread Paul B. Henson
On Sun, 20 Jun 2010, Arne Jansen wrote:

 In my experience the boot time mainly depends on the number of datasets,
 not the number of snapshots. 200 datasets is fairly easy (we have 7000,
 but did some boot-time tuning).

What kind of boot tuning are you referring to? We've got about 8k
filesystems on an x4500; it takes about 2 hours for a full boot cycle,
which is kind of annoying. The majority of that time is taken up with NFS
sharing, which currently scales very poorly :(.

Thanks...


-- 
Paul B. Henson  |  (909) 979-6361  |  http://www.csupomona.edu/~henson/
Operating Systems and Network Analyst  |  hen...@csupomona.edu
California State Polytechnic University  |  Pomona CA 91768


Re: [zfs-discuss] One dataset per user?

2010-06-22 Thread Arne Jansen

Paul B. Henson wrote:
 On Sun, 20 Jun 2010, Arne Jansen wrote:
  In my experience the boot time mainly depends on the number of datasets,
  not the number of snapshots. 200 datasets is fairly easy (we have 7000,
  but did some boot-time tuning).

 What kind of boot tuning are you referring to? We've got about 8k
 filesystems on an x4500; it takes about 2 hours for a full boot cycle,
 which is kind of annoying. The majority of that time is taken up with NFS
 sharing, which currently scales very poorly :(.


As you said most of the time is spent for nfs sharing, but mounting also isn't
as fast as it could be. We found that the zfs utility is very inefficient as
it does a lot of unnecessary and costly checks. We set mountpoint to legacy
and handle mounting/sharing ourselves in a massively parallel fashion (50
processes). Using the system utilities makes things a lot better, but you
can speed up sharing a lot more by setting the SHARE_NOINUSE_CHECK environment
variable before invoking share(1M). With this you should be able to share your
tree in about 10 seconds.
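
A minimal sketch for a single filesystem (names hypothetical):

  zfs set mountpoint=legacy tank/home/alice
  mount -F zfs tank/home/alice /export/home/alice
  SHARE_NOINUSE_CHECK=1 share -F nfs -o rw /export/home/alice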

Good luck,
Arne









Re: [zfs-discuss] One dataset per user?

2010-06-22 Thread Arne Jansen

Arne Jansen wrote:
 Paul B. Henson wrote:
  On Sun, 20 Jun 2010, Arne Jansen wrote:
   In my experience the boot time mainly depends on the number of
   datasets, not the number of snapshots. 200 datasets is fairly easy
   (we have 7000, but did some boot-time tuning).

  What kind of boot tuning are you referring to? We've got about 8k
  filesystems on an x4500; it takes about 2 hours for a full boot cycle,
  which is kind of annoying. The majority of that time is taken up with
  NFS sharing, which currently scales very poorly :(.

 As you said most of the time is spent for nfs sharing, but mounting also
 isn't as fast as it could be. We found that the zfs utility is very
 inefficient as it does a lot of unnecessary and costly checks. We set
 mountpoint to legacy and handle mounting/sharing ourselves in a massively
 parallel fashion (50 processes). Using the system utilities makes things a
 lot better, but you can speed up sharing a lot more by setting the
 SHARE_NOINUSE_CHECK environment variable before invoking share(1M). With
 this you should be able to share your tree in about 10 seconds.


I forgot the disclaimer: if you set this flag, you can crash your machine
by calling share with improper arguments. IIRC it skips the check whether
the fs is already shared, so it cannot handle a re-share properly.



Good luck,
Arne









Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Roy Sigurd Karlsbakk
- Original Message -
 On Jun 20, 2010, at 11:55, Roy Sigurd Karlsbakk wrote:
 
  There will also be a few common areas for each department and
  perhaps a backup area.
 
 The back up area should be on a different set of disks.
 
 IMHO, a back up isn't a back up unless it is an /independent/ copy of
 the data. The copy can be made via ZFS send/recv, tar, rsync, Legato/
 NetBackup, etc., but it needs to be on independent media. Otherwise,
 if the original copy goes, so does the backup.

I think you misunderstand me here. The backup area will be a storage area
for Ahsay (see http://www.ahsay.com/ ) backups of clients and applications
(Oracle, Sybase, Exchange, etc.). All datasets will be copied to a
secondary node, either with ZFS send/receive or (more probably) NexentaStor
HA Cluster ( http://kurl.no/KzHU ).

  I have read people are having problems with lengthy boot times with
  lots of datasets. We're planning to do extensive snapshotting on
  this system, so there might be close to a hundred snapshots per
  dataset, perhaps more. With 200 users and perhaps 10-20 shared
  department datasets, the number of filesystems, snapshots included,
  will be around 20k or more.
 
 You may also want to consider breaking things up into different pools
 as well. There seems to be an implicit assumption in this conversation
 that everything will be in one pool, and that may not be the best
 course of action.
 
 Perhaps one pool for users' homedirs, and another for the departmental
 stuff? Or perhaps even two different pools for homedirs, with users
 'randomly' distributed between the two (though definitely don't do
 something like alphabetical (it'll be non-even) or departmental
 (people transfer) distribution).
 
 This could add a bit of overhead, but I don't think having 2 or 3 pools
 would be much more of a big deal than one.

So far the plan is to keep it in one pool for design and administration
simplicity. Why would you want to split up (net) 40TB into more pools? Seems
to me that'll mess things up a bit, having to split up SSDs for use on
different pools, losing the flexibility of a common pool, etc. Why?
 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Roy Sigurd Karlsbakk
 Btw, what did you plan to use as L2ARC/slog?

I was thinking of using four Crucial RealSSD 256GB SSDs, with a small
RAID1+0 for SLOG and the rest for L2ARC. The system will be mainly used for
reads, so I don't think the SLOG needs will be too tough. If you have
another suggestion, please tell :)
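
For example (device names and slice layout hypothetical, for an existing
pool named tank):

  # slice 0 of each SSD: striped mirrors of log devices (RAID1+0 slog)
  zpool add tank log mirror c1t0d0s0 c1t1d0s0 mirror c1t2d0s0 c1t3d0s0
  # remaining slice on each SSD: L2ARC (cache devices are not mirrored)
  zpool add tank cache c1t0d0s1 c1t1d0s1 c1t2d0s1 c1t3d0s1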

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.



Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread David Magda

On Jun 21, 2010, at 05:00, Roy Sigurd Karlsbakk wrote:
 So far the plan is to keep it in one pool for design and administration
 simplicity. Why would you want to split up (net) 40TB into more pools?
 Seems to me that'll mess things up a bit, having to split up SSDs for
 use on different pools, losing the flexibility of a common pool etc.
 Why?

If different groups or areas have different I/O characteristics, for one.
If in one case (users) you want responsiveness, you could go with
striped mirrors. However, if departments have lots of data, it may be
worthwhile to put it on a RAID-Z pool for better storage efficiency.


Just a thought.



Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Roy Sigurd Karlsbakk
- Original Message -
 On Jun 21, 2010, at 05:00, Roy Sigurd Karlsbakk wrote:
 
  So far the plan is to keep it in one pool for design and
  administration simplicity. Why would you want to split up (net) 40TB
  into more pools? Seems to me that'll mess up things a bit, having to
  split up SSDs for use on different pools, losing the flexibility of
  a common pool etc. Why?
 
 If different groups or areas have different I/O characteristics, for one.
 If in one case (users) you want responsiveness, you could go with
 striped mirrors. However, if departments have lots of data, it may be
 worthwhile to put it on a RAID-Z pool for better storage efficiency.

We have considered RAID-1+0 and concluded we have no current need for it.
Close to 1TB of SSD cache will also help boost read speeds, so I think it
will be sufficient, at least for now. As for different I/O characteristics
in different groups/areas, this is not something we have data on for now.
Do you know a good way to check this? The data is located on two different
zpools (sol10) today.

Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Arne Jansen
David Magda wrote:
 On Jun 21, 2010, at 05:00, Roy Sigurd Karlsbakk wrote:
  So far the plan is to keep it in one pool for design and
  administration simplicity. Why would you want to split up (net) 40TB
  into more pools? Seems to me that'll mess things up a bit, having to
  split up SSDs for use on different pools, losing the flexibility of a
  common pool etc. Why?

 If different groups or areas have different I/O characteristics, for one.
 If in one case (users) you want responsiveness, you could go with
 striped mirrors. However, if departments have lots of data, it may be
 worthwhile to put it on a RAID-Z pool for better storage efficiency.

Especially if the characteristics are different I find it a good idea
to mix all on one set of spindles. This way you have lots of spindles
for fast access and lots of space for the sake of space. If you divide
the available spindles into two sets you will have far fewer spindles
available for the responsiveness goal. I don't think taking them into
a mirror can compensate for that.

--Arne


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Edward Ned Harvey
 From: James C. McPherson [mailto:j...@opensolaris.org]
 
 On the build systems that I maintain inside the firewall,
 we mandate one filesystem per user, which is a very great
 boon for system administration. 

What's the reasoning behind it?


 My management scripts are
 considerably faster running when I don't have to traverse
 whole directory trees (ala ufs).

That's a good reason.  Why would you have to traverse whole directory 
structures if you had a single zfs filesystem in a single zpool, instead of 
many zfs filesystems in a single zpool?



Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
 
 Close to 1TB SSD cache will also help to boost read
 speeds, 

Remember, this will not boost large sequential reads. (It could possibly
even hurt them.) This will only boost random reads.



Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread James C. McPherson

On 21/06/10 10:38 PM, Edward Ned Harvey wrote:

 From: James C. McPherson [mailto:j...@opensolaris.org]

  On the build systems that I maintain inside the firewall,
  we mandate one filesystem per user, which is a very great
  boon for system administration.

 What's the reasoning behind it?

Politeness, basically. Every user on these machines is expected
to make and use their own disk-space sandpit - having their own
dataset makes that work nicely.

  My management scripts are
  considerably faster running when I don't have to traverse
  whole directory trees (ala ufs).

 That's a good reason. Why would you have to traverse whole
 directory structures if you had a single zfs filesystem in
 a single zpool, instead of many zfs filesystems in a single zpool?

For instance, if I've got users a, b and c, who have their own
datasets, and users z, y and x who do not:

df -h /builds/[abczyx]

will show me the disk usage of /builds itself for z, y and x, but of

/builds/a
/builds/b
/builds/c

for the ones who do have their own dataset. So when I'm
trying to figure out who I need to yell at because they're
using more than our acceptable limit (30GB), I have to run
du -s /builds/[zyx]. And that takes time. Lots of time.
Especially on these systems, which are in huge demand from
people all over Solaris-land.
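
With one dataset per user, the same accounting is a single cheap query
(pool name hypothetical):

  zfs list -r -o name,used tank/builds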



James C. McPherson
--
Senior Software Engineer, Solaris
Oracle
http://www.jmcp.homeunix.com/blog


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Darren J Moffat

On 21/06/2010 13:59, James C. McPherson wrote:
 On 21/06/10 10:38 PM, Edward Ned Harvey wrote:
  From: James C. McPherson [mailto:j...@opensolaris.org]

   On the build systems that I maintain inside the firewall,
   we mandate one filesystem per user, which is a very great
   boon for system administration.

  What's the reasoning behind it?

 Politeness, basically. Every user on these machines is expected
 to make and use their own disk-space sandpit - having their own
 dataset makes that work nicely.


Plus it allows delegation of snapshot/clone/send/recv to the users on 
certain systems.
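
For instance (user and dataset names hypothetical; a working receive
delegation typically also needs create and mount on the target):

  zfs allow alice snapshot,clone,mount,send,receive tank/home/alice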


--
Darren J Moffat


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Fredrich Maney
On Mon, Jun 21, 2010 at 8:59 AM, James C. McPherson
j...@opensolaris.org wrote:
[...]
 So when I'm
 trying to figure out who I need to yell at because they're
 using more than our acceptable limit (30Gb), I have to run
 du -s /builds/[zyx]. And that takes time. Lots of time.
[...]

Why not just use quotas?

fpsm


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Bob Friesenhahn

On Mon, 21 Jun 2010, Edward Ned Harvey wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk

  Close to 1TB SSD cache will also help to boost read speeds,

 Remember, this will not boost large sequential reads. (It could possibly
 even hurt them.) This will only boost random reads.

Or more accurately, it boosts repeated reads. It won't help much in the
case where data is accessed only once. It is basically a poor man's
substitute for caching data in RAM. The RAM is at least 20X faster, so
the system should be stuffed with RAM first, as long as the budget can
afford it.


Luckily, most servers experience mostly repeated reads.

Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Bob Friesenhahn

On Mon, 21 Jun 2010, Arne Jansen wrote:
 Especially if the characteristics are different I find it a good idea
 to mix all on one set of spindles. This way you have lots of spindles
 for fast access and lots of space for the sake of space. If you divide
 the available spindles into two sets you will have far fewer spindles
 available for the responsiveness goal. I don't think taking them into
 a mirror can compensate for that.

This is something that I can agree with. The total number of vdevs in the
pool is what primarily determines its responsiveness. While using the same
number of devices, splitting the pool might not result in more vdevs in
either pool. Mirrors do double the number of readable devices, but the
side selected for a read is random, so the actual read performance
improvement is perhaps on the order of 50% rather than 100%. Raidz does
steal IOPS, so smaller raidz vdevs will help and result in more vdevs in
the pool.


Bob
--
Bob Friesenhahn
bfrie...@simple.dallas.tx.us, http://www.simplesystems.org/users/bfriesen/
GraphicsMagick Maintainer,http://www.GraphicsMagick.org/


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread Roy Sigurd Karlsbakk
- Original Message -
  From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
  boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
 
  Close to 1TB SSD cache will also help to boost read
  speeds,
 
 Remember, this will not boost large sequential reads. (Could possibly
 maybe even hurt it.) This will only boost random reads.

As far as I can see, we mostly have random reads and not too much large
sequential I/O, so this is what I'm looking for.
 
Vennlige hilsener / Best regards

roy
--
Roy Sigurd Karlsbakk
(+47) 97542685
r...@karlsbakk.net
http://blogg.karlsbakk.net/
--
In all pedagogy it is essential that the curriculum be presented
intelligibly. It is an elementary imperative for all pedagogues to avoid
excessive use of idioms of foreign origin. In most cases, adequate and
relevant synonyms exist in Norwegian.


Re: [zfs-discuss] One dataset per user?

2010-06-21 Thread James C. McPherson

On 22/06/10 01:05 AM, Fredrich Maney wrote:
 On Mon, Jun 21, 2010 at 8:59 AM, James C. McPherson
 j...@opensolaris.org wrote:
 [...]
  So when I'm trying to figure out who I need to yell at because
  they're using more than our acceptable limit (30GB), I have to run
  du -s /builds/[zyx]. And that takes time. Lots of time.
 [...]

 Why not just use quotas?

Quotas are not always appropriate.

Also, given our usage model, and wanting to provide a service that
gatelings can use to work on multiple changesets concurrently, we figure
that telling people

"your limit is X GB, and we will publicly shame you if you exceed it,
then go and remove old stuff for you"

is sufficiently hands-off. We're adults here, not children or kiddies
with no regard for our fellow engineers.


James C. McPherson
--
Senior Software Engineer, Solaris
Oracle
http://www.jmcp.homeunix.com/blog


Re: [zfs-discuss] One dataset per user?

2010-06-20 Thread Arne Jansen

Roy Sigurd Karlsbakk wrote:
 I have read people are having problems with lengthy boot times with lots
 of datasets. We're planning to do extensive snapshotting on this system,
 so there might be close to a hundred snapshots per dataset, perhaps more.
 With 200 users and perhaps 10-20 shared department datasets, the number
 of filesystems, snapshots included, will be around 20k or more.

In my experience the boot time mainly depends on the number of datasets,
not the number of snapshots. 200 datasets is fairly easy (we have 7000,
but did some boot-time tuning).

 Will trying such a setup be betting on help from some god, or is it
 doable? The box we're planning to use will have 48 gigs of memory and
 about 1TB L2ARC (shared with SLOG, we just use some slices for that).


Try. The main problem with having many snapshots is the time needed for
zfs list, because it has to scrape all the information from disk, but with
that much RAM/L2ARC it shouldn't be a problem here. Another thing to
consider is the frequency with which you plan to take the snapshots, and
whether you want individual schedules for each dataset. Taking a snapshot
is a heavy-weight operation, as it terminates the current txg.
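
If the schedules allow it, a recursive snapshot covers a dataset and all
of its descendants atomically, in one operation rather than one per
dataset (names hypothetical):

  zfs snapshot -r tank/home@2010-06-20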

Btw, what did you plan to use as L2ARC/slog?

--Arne







Re: [zfs-discuss] One dataset per user?

2010-06-20 Thread David Magda

On Jun 20, 2010, at 11:55, Roy Sigurd Karlsbakk wrote:
 There will also be a few common areas for each department and
 perhaps a backup area.

The back up area should be on a different set of disks.

IMHO, a back up isn't a back up unless it is an /independent/ copy of
the data. The copy can be made via ZFS send/recv, tar, rsync, Legato/
NetBackup, etc., but it needs to be on independent media. Otherwise,
if the original copy goes, so does the backup.

 I have read people are having problems with lengthy boot times with
 lots of datasets. We're planning to do extensive snapshotting on
 this system, so there might be close to a hundred snapshots per
 dataset, perhaps more. With 200 users and perhaps 10-20 shared
 department datasets, the number of filesystems, snapshots included,
 will be around 20k or more.

You may also want to consider breaking things up into different pools
as well. There seems to be an implicit assumption in this conversation
that everything will be in one pool, and that may not be the best
course of action.

Perhaps one pool for users' homedirs, and another for the departmental
stuff? Or perhaps even two different pools for homedirs, with users
'randomly' distributed between the two (though definitely don't do
something like alphabetical (it'll be non-even) or departmental
(people transfer) distribution).

This could add a bit of overhead, but I don't think having 2 or 3 pools
would be much more of a big deal than one.
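
A sketch of such a split (pool and device names hypothetical):

  # responsiveness: striped mirrors for home directories
  zpool create homes mirror c0t0d0 c0t1d0 mirror c0t2d0 c0t3d0
  # capacity: raidz for the departmental areas
  zpool create depts raidz c0t4d0 c0t5d0 c0t6d0 c0t7d0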




Re: [zfs-discuss] One dataset per user?

2010-06-20 Thread Ian Collins

On 06/21/10 03:55 AM, Roy Sigurd Karlsbakk wrote:
 Hi all

 We're working on replacing our current fileserver with something based
 on either Solaris or NexentaStor. We have about 200 users with variable
 needs. There will also be a few common areas for each department and
 perhaps a backup area. I think these should be separated with datasets,
 for simplicity and overview, but I'm not sure if it's a good idea.

 I have read people are having problems with lengthy boot times with
 lots of datasets. We're planning to do extensive snapshotting on this
 system, so there might be close to a hundred snapshots per dataset,
 perhaps more. With 200 users and perhaps 10-20 shared department
 datasets, the number of filesystems, snapshots included, will be around
 20k or more.

200 user filesystems isn't too big.  One of the systems I look after has
about 1100 user filesystems with up to 20 snapshots each.  The impact on
boot time is minimal.


--
Ian.



Re: [zfs-discuss] One dataset per user?

2010-06-20 Thread Edward Ned Harvey
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk
 
 Will trying such a setup be betting on help from some god, or is it
 doable? The box we're planning to use will have 48 gigs of memory and

There's nothing difficult about it.  Go ahead and test.

Personally, I don't see much value in using lots of separate filesystems.  
They're all in the same pool, right?  I use one big filesystem.

There are legitimate specific reasons to use separate filesystems in some 
circumstances.  But if you can't name one reason why it's better ... then it's 
not better for you.



Re: [zfs-discuss] One dataset per user?

2010-06-20 Thread James C. McPherson

On 21/06/10 12:58 PM, Edward Ned Harvey wrote:
 From: zfs-discuss-boun...@opensolaris.org [mailto:zfs-discuss-
 boun...@opensolaris.org] On Behalf Of Roy Sigurd Karlsbakk

  Will trying such a setup be betting on help from some god, or is it
  doable? The box we're planning to use will have 48 gigs of memory and

 There's nothing difficult about it. Go ahead and test.

 Personally, I don't see much value in using lots of separate
 filesystems. They're all in the same pool, right? I use one big
 filesystem.

 There are legitimate specific reasons to use separate filesystems in
 some circumstances. But if you can't name one reason why it's better
 ... then it's not better for you.

On the build systems that I maintain inside the firewall,
we mandate one filesystem per user, which is a very great
boon for system administration. My management scripts are
considerably faster running when I don't have to traverse
whole directory trees (ala ufs).



James C. McPherson
--
Senior Software Engineer, Solaris
Oracle
http://www.jmcp.homeunix.com/blog