Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Tomas Ögren
On 31 May, 2011 - Gertjan Oude Lohuis sent me these 0,9K bytes:

> On 05/31/2011 03:52 PM, Tomas Ögren wrote:
>> I've done a not too scientific test on reboot times for Solaris 10 vs 11
>> with regard to many filesystems...
>>
>
>> http://www8.cs.umu.se/~stric/tmp/zfs-many.png
>>
>> As the picture shows, don't try 10000 filesystems with nfs on sol10.
>> Creating additional filesystems also gets slower and slower the more
>> you already have.
>>
>
> Since all filesystems would be shared via NFS, this clearly is a no-go :).
> Thanks!
>
>> On a different setup, we have about 750 datasets where we would like to
>> use a single recursive snapshot, but when doing that all file access
>> will be frozen for varying amounts of time
>
> What version of ZFS are you using? As Matthew Ahrens said, version 27
> has a fix for this.

22, Solaris 10.

/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Richard Elling
On May 31, 2011, at 2:29 PM, Gertjan Oude Lohuis wrote:

> On 05/31/2011 03:52 PM, Tomas Ögren wrote:
>> I've done a not too scientific test on reboot times for Solaris 10 vs 11
>> with regard to many filesystems...
>> 
> 
>> http://www8.cs.umu.se/~stric/tmp/zfs-many.png
>> 
>> As the picture shows, don't try 10000 filesystems with nfs on sol10.
>> Creating additional filesystems also gets slower and slower the more
>> you already have.
>> 
> 
> Since all filesystems would be shared via NFS, this clearly is a no-go :).
> Thanks!

If you search the archives, you will find that the people who tried to
do this in the past were more successful with legacy NFS export methods
than with the sharenfs property in ZFS.
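For reference, the legacy method amounts to turning off ZFS-managed
sharing and exporting through the traditional share(1M)/dfstab
mechanism instead. A minimal sketch, with hypothetical pool and path
names:

zfs set sharenfs=off tank/export       # stop ZFS from managing the exports
# list each filesystem in /etc/dfs/dfstab instead, e.g.:
#   share -F nfs -o rw /export/home/user1
#   share -F nfs -o rw /export/home/user2
shareall                               # activate the legacy shares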
 -- richard



Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Gertjan Oude Lohuis

On 05/31/2011 12:26 PM, Khushil Dep wrote:

> Generally snapshots are quick operations, but 10,000 such operations
> would, I believe, take enough time to complete as to present
> operational issues - breaking these into sets would alleviate some?
> Perhaps if you are starting to run into many thousands of filesystems
> you would need to re-examine your rationale in creating so many.



Thanks for your feedback! My rationale is this: I have a lot of
hosting accounts which have databases. These databases need to be
backed up, preferably with mysqldump, and there needs to be historic
data. I would like to use ZFS snapshots for this. However, some
variables need to be taken into account:


* Different hosting plans offer different backup schedules: every 3
hours, or every 24 hours. Backups might be kept for 3, 14 or 30 days.
Accounts with different schedules thus need to be on separate storage;
otherwise I can't create a matching schedule to create and rotate
snapshots.


* Databases are hosted on multiple database servers, and are
frequently migrated between them. I could create a ZFS filesystem for
each server, but if a hosting account is migrated, all its backups
would be 'lost'.


Having one filesystem per hosting account would solve nearly all of
the disadvantages I could think of (a per-plan rotation could then
look like the sketch below). But I don't think it is going to work,
sadly. I'll have to make some choices :).
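A per-plan rotation along those lines could be a simple cron-driven
script. A hypothetical, untested sketch (dataset layout, snapshot
naming and retention numbers are all assumptions):

#!/bin/sh
# One filesystem per plan/account; snapshot each run, keep the last $KEEP.
FS=tank/backups/plan-3h                # hypothetical dataset for one plan
KEEP=24                                # e.g. 3 days of 3-hourly snapshots
zfs snapshot "$FS@auto-`date +%Y%m%d-%H%M`"
N=`zfs list -H -o name -t snapshot -r "$FS" | grep -c "@auto-"`
DEL=`expr $N - $KEEP`
if [ "$DEL" -gt 0 ]; then
    # the names sort chronologically, so the oldest come first
    zfs list -H -o name -t snapshot -r "$FS" | grep "@auto-" | sort | \
        head -n "$DEL" | while read S ; do zfs destroy "$S" ; done
fi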


Regards,
Gertjan Oude Lohuis


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Gertjan Oude Lohuis

On 05/31/2011 03:52 PM, Tomas Ögren wrote:

> I've done a not too scientific test on reboot times for Solaris 10 vs 11
> with regard to many filesystems...
>
> http://www8.cs.umu.se/~stric/tmp/zfs-many.png
>
> As the picture shows, don't try 10000 filesystems with nfs on sol10.
> Creating additional filesystems also gets slower and slower the more
> you already have.



Since all filesystems would be shared via NFS, this clearly is a
no-go :). Thanks!



> On a different setup, we have about 750 datasets where we would like to
> use a single recursive snapshot, but when doing that all file access
> will be frozen for varying amounts of time


What version of ZFS are you using? As Matthew Ahrens said, version 27
has a fix for this.



Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Matthew Ahrens
On Tue, May 31, 2011 at 6:52 AM, Tomas Ögren wrote:

>
> On a different setup, we have about 750 datasets where we would like to
> use a single recursive snapshot, but when doing that all file access
> will be frozen for varying amounts of time (sometimes half an hour or
> way more). Splitting it up into ~30 subsets, doing recursive snapshots
> over those instead has decreased the total snapshot time greatly and cut
> the "frozen time" down to single digit seconds instead of minutes or
> hours.
>

If you can upgrade to zpool version 27 or later, you should see much,
much less "frozen time" when doing a "zfs snapshot -r" of thousands of
filesystems.
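For what it's worth, checking and raising the pool version is a
one-liner each ("pond" is an example name; note the upgrade is one-way):

zpool upgrade          # list pools running older on-disk versions
zpool upgrade -v       # show which versions this OS release supports
zpool upgrade pond     # upgrade one pool (irreversible)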

--matt


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Jim Klimov
In general, you may need to keep data in one dataset if it is somehow
related (e.g. the backup of a specific machine or program, a user's
home, etc.) and if you plan to manage it in a consistent manner. For
example, CIFS shares cannot be nested, so for a unitary share (like
"distribs") you would probably want one dataset. Also, you can only
have hardlinks within one FS dataset, so if you maintain different
views into a distribution set (e.g. sorted by vendor or sorted by
software type) and you do it with hardlinks, you need one dataset as
well. If you often move (link and unlink) files around, e.g. from an
"incoming" directory to final storage, you may or may not want that
"incoming" in the same dataset; this depends on other considerations
too.
 
You want to split datasets when you need them to have different
features and perhaps different uses, e.g. to have them as separate
shares, to enforce separate quotas and reservations, or to delegate
administration to particular OS users (e.g. let a user manage
snapshots of his own homedir) and/or local zones. Don't forget about
individual dataset properties (e.g. you may want compression for
source code files but not for a multimedia collection), snapshots,
clones, etc.
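A few illustrative commands (hypothetical dataset and user names; the
exact permission set needed can vary by release):

zfs set compression=on pond/export/src        # source code compresses well
zfs set compression=off pond/export/media     # multimedia does not
zfs set quota=10G pond/export/home/alice      # per-user space limit
# delegate snapshot management of a homedir to its owner:
zfs allow alice snapshot,mount,destroy pond/export/home/alice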
 
> 2. space management (we have wasted space in some pools while others
> are starved)
Well, that's a reason to decrease the number of pools, but not datasets ;)
 
> 3. tool speed
> 
> I do not have good numbers for the time to do some of these operations
> as we are down to under 200 datasets (1/3 of the way through the
> migration to the new layout). I do have log entries that point to
> about a minute to complete a `zfs list` operation.
> 
> > Would I run into any problems when snapshots are taken (almost)
> > simultaneously of multiple filesystems at once?
> 
> Our logs show snapshot creation time at 2 seconds or less, but we
> do not try to do them all at once; we walk the list of datasets and
> process (snapshot and replicate) each in turn.

I can partially relate to that. We have a Thumper system running
OpenSolaris SXCE snv_177, with a separate dataset for each user's home
directory, for backups of each individual remote machine, for each VM
image, each local zone, etc. - in particular so as to have a separate
history of snapshots and the possibility to clone whatever we need.
 
Whether its relatively many filesystems (about 350) are a problem
depends on the tool used. For example, a typical import of the main
pool may take up to 8 minutes when in safe mode, but many of the
delays seem to be related to attempts to share_nfs and share_cifs
while the network is down ;)
 
Auto-snapshots are on, and listing them all is indeed rather slow:
 
[root@thumper ~]# time zfs list -tall -r pond | wc -l
   56528
real    0m18.146s
user    0m7.360s
sys     0m10.084s

[root@thumper ~]# time zfs list -tvolume -r pond | wc -l
       5
real    0m0.096s
user    0m0.025s
sys     0m0.073s

[root@thumper ~]# time zfs list -tfilesystem -r pond | wc -l
     353
real    0m0.123s
user    0m0.052s
sys     0m0.073s

Some operations, like listing the filesystems, SEEM slow due to the
terminal, but are in fact rather quick:

[root@thumper ~]# time df -k | wc -l
     363
real    0m2.104s
user    0m0.094s
sys     0m0.183s

However, low-level system programs may have problems with many FSes;
one known troublemaker is LiveUpgrade. Jens Elkner has published a
wonderful set of patches for Solaris 10 and OpenSolaris that limit LU's
attention to just the filesystems the admin knows are relevant to the
OS upgrade (they also fix mount ordering and other known bugs of that
LU software release):
* http://iws.cs.uni-magdeburg.de/~elkner/luc/lutrouble.html
 
True, 10,000 FSes is not something I have seen myself, so some tools
(especially legacy ones) may break at the sheer number of mountpoints :)
 
One of my own tricks for cleaning out snapshots, e.g. to relieve pool
space starvation quickly, is to use parallel "zfs destroy" invocations
like this (note the ampersand):

# zfs list -t snapshot -r pond/export/home/user | grep @zfs-auto-snap | \
    awk '{print $1}' | while read Z ; do zfs destroy "$Z" & done
 
This may spawn several thousand processes (if called for the root
dataset), but they often complete in just 1-2 minutes instead of hours
for a one-by-one series of calls - I guess because this way many ZFS
metadata operations are requested in a small timeframe and get
coalesced into a few big writes.
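If spawning several thousand processes at once is a concern, a batched
variant keeps the coalescing benefit with bounded parallelism. A plain
Bourne-shell sketch (untested; same example dataset, batch size is
arbitrary):

zfs list -H -o name -t snapshot -r pond/export/home/user | \
grep @zfs-auto-snap | (
    n=0
    while read Z ; do
        zfs destroy "$Z" &            # same trick, still backgrounded
        n=`expr $n + 1`
        if [ "$n" -ge 100 ]; then     # but cap at ~100 jobs in flight
            wait                      # drain the batch
            n=0
        fi
    done
    wait                              # drain the final batch
)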
 


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Eric D. Mudama

On Tue, May 31 at  8:52, Paul Kraus wrote:

> When we initially configured a large (20TB) file server about 5
> years ago, we went with multiple zpools and multiple datasets (zfs) in
> each zpool. Currently we have 17 zpools and about 280 datasets.
> Nowhere near the 10,000+ you intend. We are moving _away_ from the
> many dataset model to one zpool and one dataset. We are doing this for
> the following reasons:
>
> 1. manageability
> 2. space management (we have wasted space in some pools while others
> are starved)
> 3. tool speed
>
> I do not have good numbers for the time to do some of these operations
> as we are down to under 200 datasets (1/3 of the way through the
> migration to the new layout). I do have log entries that point to
> about a minute to complete a `zfs list` operation.


It would be interesting to see whether you would still have issue #3
with one pool and your 280 datasets. It would definitely eliminate #2.

--
Eric D. Mudama
edmud...@bounceswoosh.org



Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Jerry Kemp
Gertjan,

In addition to the comments replying directly to your post, we have
had similar discussions previously on the zfs-discuss list.

If you care to go and review the list archives, I can share that we
have had similar discussions in at least the following periods:

March 2006
May 2008
January 2010
February 2010

There may be (and probably is) more in the list archives, but I know
from my personal archives that these are good dates.

Hope this helps,

Jerry



On 05/31/11 05:08, Gertjan Oude Lohuis wrote:
> "Filesystems are cheap" is one of ZFS's mottos. I'm wondering how far
> this goes. Does anyone have experience with more than 10.000 ZFS
> filesystems? I know that mounting this many filesystems during boot
> will take considerable time. Are there any other disadvantages that I
> should be aware of? Are zfs-tools still usable, like 'zfs list' and
> 'zfs get/set'?
> Would I run into any problems when snapshots are taken (almost)
> simultaneously of multiple filesystems at once?
> 
> Regards,
> Gertjan Oude Lohuis
>


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Tomas Ögren
On 31 May, 2011 - Khushil Dep sent me these 4,5K bytes:

> The adage that I adhere to with ZFS features is "just because you can
> doesn't mean you should!". I would suspect that with that many
> filesystems the normal zfs-tools would also take an inordinate length
> of time to complete their operations - scale according to size.

I've done a not too scientific test on reboot times for Solaris 10 vs 11
with regard to many filesystems...

Quad Xeon machines with a single raid10 and one boot environment.
Using more BEs with LU on sol10 will make the situation even worse, as
it's LU that's taking the time, (re)mounting all filesystems over and
over again.
http://www8.cs.umu.se/~stric/tmp/zfs-many.png

As the picture shows, don't try 10000 filesystems with nfs on sol10.
Creating additional filesystems also gets slower and slower the more
you already have.

> Generally snapshots are quick operations, but 10,000 such operations
> would, I believe, take enough time to complete as to present
> operational issues - breaking these into sets would alleviate some?
> Perhaps if you are starting to run into many thousands of filesystems
> you would need to re-examine your rationale in creating so many.

On a different setup, we have about 750 datasets where we would like to
use a single recursive snapshot, but when doing that all file access
will be frozen for varying amounts of time (sometimes half an hour or
way more). Splitting it up into ~30 subsets and doing recursive
snapshots over those instead has decreased the total snapshot time
greatly and cut the "frozen time" down to single-digit seconds instead
of minutes or hours; a sketch of the approach is below.
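A sketch of the subset approach - one recursive snapshot per top-level
child instead of a single pool-wide one ("pond" is a hypothetical pool
name; the trade-off is that the snapshot set is no longer atomic
across subtrees):

STAMP=`date +%Y%m%d-%H%M`
# pick the direct children of the pool (names with two path components):
zfs list -H -o name -r pond | awk -F/ 'NF == 2' | \
while read TOP ; do
    zfs snapshot -r "$TOP@auto-$STAMP"
done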

> My 2c. YMMV.
> 
> -- 
> Khush
> 
> On Tuesday, 31 May 2011 at 11:08, Gertjan Oude Lohuis wrote:
> 
> > "Filesystems are cheap" is one of ZFS's mottos. I'm wondering how far
> > this goes. Does anyone have experience with more than 10.000 ZFS
> > filesystems? I know that mounting this many filesystems during boot
> > will take considerable time. Are there any other disadvantages that I
> > should be aware of? Are zfs-tools still usable, like 'zfs list' and
> > 'zfs get/set'?
> > Would I run into any problems when snapshots are taken (almost)
> > simultaneously of multiple filesystems at once?
> > 
> > Regards,
> > Gertjan Oude Lohuis



/Tomas
-- 
Tomas Ögren, st...@acc.umu.se, http://www.acc.umu.se/~stric/
|- Student at Computing Science, University of Umeå
`- Sysadmin at {cs,acc}.umu.se


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Paul Kraus
On Tue, May 31, 2011 at 6:08 AM, Gertjan Oude Lohuis wrote:

> "Filesystems are cheap" is one of ZFS's mottos. I'm wondering how far
> this goes. Does anyone have experience with more than 10.000 ZFS
> filesystems? I know that mounting this many filesystems during boot
> will take considerable time. Are there any other disadvantages that I
> should be aware of? Are zfs-tools still usable, like 'zfs list' and
> 'zfs get/set'?

When we initially configured a large (20TB) file server about 5
years ago, we went with multiple zpools and multiple datasets (zfs) in
each zpool. Currently we have 17 zpools and about 280 datasets.
Nowhere near the 10,000+ you intend. We are moving _away_ from the
many dataset model to one zpool and one dataset. We are doing this for
the following reasons:

1. manageability
2. space management (we have wasted space in some pools while others
are starved)
3. tool speed

I do not have good numbers for the time to do some of these operations
as we are down to under 200 datasets (1/3 of the way through the
migration to the new layout). I do have log entries that point to
about a minute to complete a `zfs list` operation.

> Would I run into any problems when snapshots are taken (almost)
> simultaneously of multiple filesystems at once?

Our logs show snapshot creation time at 2 seconds or less, but we
do not try to do them all at once; we walk the list of datasets and
process (snapshot and replicate) each in turn, as sketched below.
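Such a walk can be a simple serial loop. A hypothetical, untested
sketch (names are examples; assumes the previous snapshot already
exists on both sides for the incremental send, and that the target
hierarchy exists under "backuppool"):

#!/bin/sh
STAMP=`date +%Y%m%d`
PREV=20110530              # stamp of the previous run, tracked elsewhere
zfs list -H -o name -r pond | while read DS ; do
    zfs snapshot "$DS@backup-$STAMP"
    zfs send -i "@backup-$PREV" "$DS@backup-$STAMP" | \
        ssh backuphost zfs receive -F "backuppool/$DS"
done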

-- 
{1-2-3-4-5-6-7-}
Paul Kraus
-> Senior Systems Architect, Garnet River ( http://www.garnetriver.com/ )
-> Sound Coordinator, Schenectady Light Opera Company (
http://www.sloctheater.org/ )
-> Technical Advisor, RPI Players


Re: [zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Khushil Dep
The adage that I adhere to with ZFS features is "just because you can doesn't 
mean you should!". I would suspect that with that many filesystems the normal 
zfs-tools would also take an inordinate length of time to complete their 
operations - scale according to size.

Generally snapshots are quick operations, but 10,000 such operations
would, I believe, take enough time to complete as to present
operational issues - breaking these into sets would alleviate some?
Perhaps if you are starting to run into many thousands of filesystems
you would need to re-examine your rationale in creating so many.

My 2c. YMMV.

-- 
Khush

On Tuesday, 31 May 2011 at 11:08, Gertjan Oude Lohuis wrote:

> "Filesystems are cheap" is one of ZFS's mottos. I'm wondering how far
> this goes. Does anyone have experience with more than 10.000 ZFS
> filesystems? I know that mounting this many filesystems during boot
> will take considerable time. Are there any other disadvantages that I
> should be aware of? Are zfs-tools still usable, like 'zfs list' and
> 'zfs get/set'?
> Would I run into any problems when snapshots are taken (almost)
> simultaneously of multiple filesystems at once?
> 
> Regards,
> Gertjan Oude Lohuis



[zfs-discuss] Experiences with 10.000+ filesystems

2011-05-31 Thread Gertjan Oude Lohuis
"Filesystem are cheap" is one of ZFS's mottos. I'm wondering how far
this goes. Does anyone have experience with having more than 10.000 ZFS
filesystems? I know that mounting this many filesystems during boot
while take considerable time. Are there any other disadvantages that I
should be aware of? Are zfs-tools still usable, like 'zfs list', 'zfs
get/set'.
Would I run into any problems when snapshots are taken (almost)
simultaneously from multiple filesystems at once?

Regards,
Gertjan Oude Lohuis