Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-29 Thread Nicolas Williams
On Sat, Dec 27, 2008 at 02:29:58PM -0800, Ross wrote:
 All of which sound like good reasons to use send/receive and a 2nd zfs
 pool instead of mirroring.

Yes.

 Send/receive has the advantage that the receiving filesystem is
 guaranteed to be in a stable state.  How would you go about recovering

Among many other advantages.

 the system in the event of a drive failure though?  Would you have to
 replace the system drive, boot off a solaris DVD and then connect the
 external drive and send/receive it back?

You could boot from your backup, if you zfs sent it the relevant
snapshots of your boot datasets and installed GRUB on it.

So: replace the main drive, boot from the backup, back up the backup to
the new main drive, then reboot from the main drive.

If you want to back up only your user data (home directory) then yes,
you'd have to re-install, then restore the user data from the backup.

And yes, the restore procedure would involve a zfs send from the backup
to the new pool.
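
To make that concrete, a rough sketch of the restore (pool, snapshot,
and device names here are invented for illustration):

    # recreate the root pool on the replacement disk
    zpool create rpool c0t0d0s0

    # send the replicated snapshots back from the backup pool
    zfs send -R backup/rpool@latest | zfs recv -Fu rpool

    # reinstall the boot blocks so the new disk is bootable
    installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c0t0d0s0

(You'd also need to set the bootfs pool property afterwards; consider
this a sketch, not a procedure.)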

 It won't be quick, but replacing a failed single boot drive never is.
 Would it be possible to combine the send/receive backup with a
 scripted installation saved on the external media?  Something that

That would be nice, primarily so that you needn't back up anything that
can be simply re-installed.

Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-28 Thread dick hoogendijk
On Sat, 27 Dec 2008 14:29:58 PST
Ross myxi...@googlemail.com wrote:

 All of which sound like good reasons to use send/receive and a 2nd
 zfs pool instead of mirroring.
 
 Send/receive has the advantage that the receiving filesystem is
 guaranteed to be in a stable state.

Can send/receive be used on a running multiuser server system? Will
this slow down the services on the server much?

Can the zfs receiving end be transformed into a normal file.bz2, or
does it always have to be a zfs filesystem as a result?

-- 
Dick Hoogendijk -- PGP/GnuPG key: 01D2433D
+ http://nagual.nl/ | SunOS sxce snv104 ++
+ All that's really worth doing is what we do for others (Lewis Carrol)


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-28 Thread Volker A. Brandt
  Send/receive has the advantage that the receiving filesystem is
  guaranteed to be in a stable state.
 
 Can send/receive be used on a running multiuser server system?

Yes.

 Will
 this slow down the services on the server much?

Depends.  On a modern box with good disk layout it shouldn't.

 Can the zfs receiving end be transformed into a normal file.bz2

Yes.  However, you have to carefully match the sending and receiving
ZFS versions; not all versions can read all streams.  If you delay
receiving the stream, you may find that you can no longer
unpack it.
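
As a sketch of what that might look like (dataset and file names
invented), subject to the version caveats above:

    zfs send tank/home@backup1 | bzip2 -c > /backup/home.zfs.bz2

    # and later, on a ZFS version that can read the stream:
    bzcat /backup/home.zfs.bz2 | zfs recv tank/home-restored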


Regards -- Volker
-- 

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt  Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-28 Thread Kees Nuyt
On Sun, 28 Dec 2008 15:27:00 +0100, dick hoogendijk
d...@nagual.nl wrote:

On Sat, 27 Dec 2008 14:29:58 PST
Ross myxi...@googlemail.com wrote:

 All of which sound like good reasons to use send/receive and a 2nd
 zfs pool instead of mirroring.
 
 Send/receive has the advantage that the receiving filesystem is
 guaranteed to be in a stable state.

Can send/receive be used on a running multiuser server system? 

Yes.

Will this slow down the services on the server much?

Only if the server is busy, that is, has no idle CPU and I/O
capacity. It may help to nice the send process.

Sending a complete pool can be a considerable load for a
considerable time; an incremental send of a snapshot with
few changes relative to the previous one will be fast.
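
A hedged sketch of such a niced incremental send (pool and snapshot
names invented):

    nice -n 19 zfs send -i tank@monday tank@tuesday | \
        zfs recv -F backup/tank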

Can the zfs receiving end be transformed into a normal file.bz2
or does it always have to be a zfs filesystem as a result?

Send streams are version-dependent; it is advisable to receive
them immediately.

If the receiving zfs pool uses a file as its block device,
you could export the pool and bzip that file.
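
Roughly (sizes and paths invented):

    mkfile 10g /backup/pool.img
    zpool create filepool /backup/pool.img
    # ... zfs recv streams into filepool ...
    zpool export filepool
    bzip2 /backup/pool.img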
-- 
  (  Kees Nuyt
  )
c[_]


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-28 Thread Tim
On Sat, Dec 27, 2008 at 3:24 PM, Miles Nordin car...@ivy.net wrote:

  t == Tim  t...@tcsac.net writes:

 t couldn't you simply do a detach before removing the disk, and
 t do a re-attach every time you wanted to re-mirror?

 no, for two reasons.  First, when you detach a disk, ZFS writes
 something to the disk that makes it unrecoverable.  The simple-UI
 wallpaper blocks your access to the detached disk, so you have no
 redundancy while detached.  In this thread is a workaround to disable
 the checks (AIUI they're explaining a more fundamental problem with a
 multi-vdev pool because you can't detach one-mirror-half of each vdev
 at exactly the same instant, but multi-vdev is not part of Niall's
 case):

  http://opensolaris.org/jive/thread.jspa?threadID=58780


Gotcha, that's more than a bit ridiculous.  If I detach a disk, I guess I'd
expect to have to clear metadata if that's what I wanted, rather than it
automatically doing so.  I guess I almost feel there should either be a
secondary command, or some flags added for just such situations as this.
Personally I'd much rather have attach/detach commands than having to do a
zfs send.  Perhaps I'm alone in that feeling though.



 second, when you attach rather than online/clear/notice-its-back, ZFS
 will treat the newly-attached disk as empty and will resilver
 everything, not just your changes.  It's the difference between taking
 5 minutes and taking all night.  and you don't have redundancy until
 the resilver finishes.


Odd, my experience was definitely not the same.  When I re-attached, it did
not sync the entire disk.  Good to know that the expected behavior is
different than what I saw.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-27 Thread Miles Nordin
 t == Tim  t...@tcsac.net writes:

 t couldn't you simply do a detach before removing the disk, and
 t do a re-attach every time you wanted to re-mirror?

no, for two reasons.  First, when you detach a disk, ZFS writes
something to the disk that makes it unrecoverable.  The simple-UI
wallpaper blocks your access to the detached disk, so you have no
redundancy while detached.  In this thread is a workaround to disable
the checks (AIUI they're explaining a more fundamental problem with a
multi-vdev pool because you can't detach one-mirror-half of each vdev
at exactly the same instant, but multi-vdev is not part of Niall's
case):

 http://opensolaris.org/jive/thread.jspa?threadID=58780

second, when you attach rather than online/clear/notice-its-back, ZFS
will treat the newly-attached disk as empty and will resilver
everything, not just your changes.  It's the difference between taking
5 minutes and taking all night.  and you don't have redundancy until
the resilver finishes.
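
In command terms, the difference is roughly (device names invented):

    # full resilver: the re-attached disk is treated as empty
    zpool detach tank c5t0d0
    zpool attach tank c1t0d0 c5t0d0

    # quick resilver: only changes made while the disk was away are copied
    zpool offline tank c5t0d0
    # ... unplug, work, replug ...
    zpool online tank c5t0d0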

Another interesting question is:

  (1) unplug the home USB disk

  (2) write to the internal laptop disk

  (3) reattach the USB disk and start resilvering (the quick-resilver,
  not the newly-attached resilver)

  (a) laptop disk goes bad.  maybe a bunch of UNC's are uncovered
  during the resilver.  This is pretty plausible.

  Plausible how?  maybe you've been running without making
  backup for weeks, and then your machine started acting goofy
  so you said ``shit!  i haven't backed up!  i better do it
  now if I still can.''  you just plugged in because the
  laptop disk was _starting_ to go bad, and then it did.

  (b) system freezes and has to be hard-reset during the resilver

  (c) maybe after reboot you try again.  eventually you give up on
  the laptop disk and decide to lose your last month of
  changes.  you just want your data back as of the last
  backup, and a working machine.

  (4) what's the status of the home USB disk after these
  partial-resilvers?  `no valid replicas', or it works?  My
  impression is, at least sometimes or even usually, it will work.
  Is it guaranteed to always work, though?

Ordinary manual backups DO tolerate this scenario: if your machine
breaks while writing this week's incremental, you can still restore by
reading last week's incremental.

I remember AVS/ii had a brittle but planned scheme for this scenario,
too:

 * First, there is always a resilver source and target---for better or
   worse, it's not an ambiguous merging operation like ZFS.  

 * Second, before starting the resilver, you take an ii (device-level
   snapshot) of the resilver target.  The resilver is done UNsafely,
   but if it stops mid-way, you can roll back to the pre-resilver
   snapshot and get back the working home/target disk you had in (1).

SVM and Linux-LVM2 and most RAID-like mirrors do NOT handle this
scenario gracefully.




Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-27 Thread Ross
All of which sound like good reasons to use send/receive and a 2nd zfs pool 
instead of mirroring.

Send/receive has the advantage that the receiving filesystem is guaranteed to 
be in a stable state.  How would you go about recovering the system in the 
event of a drive failure though?  Would you have to replace the system drive, 
boot off a solaris DVD and then connect the external drive and send/receive it 
back?

It won't be quick, but replacing a failed single boot drive never is.  Would it 
be possible to combine the send/receive backup with a scripted installation 
saved on the external media?  Something that allows you to get working again 
quickly, with your data getting restored a little more slowly (but accessible 
in a read-only form on the backup disk while this happens)?

I appreciate this is a much more complex solution, but is it something that 
should be considered?


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-22 Thread Ross Smith
On Fri, Dec 19, 2008 at 6:47 PM, Richard Elling richard.ell...@sun.com wrote:
 Ross wrote:

 Well, I really like the idea of an automatic service to manage
 send/receives to backup devices, so if you guys don't mind, I'm going to
 share some other ideas for features I think would be useful.


 cool.

 One of the first is that you need some kind of capacity management and
 snapshot deletion.  Eventually backup media are going to fill and you need
 to either prompt the user to remove snapshots, or even better, you need to
 manage the media automatically and remove old snapshots to make space for
 new ones.


 I've implemented something like this for a project I'm working on.
 Consider this a research project at this time, though I hope to
 leverage some of the things we learn as we scale up, out, and
 refine the operating procedures.

Way cool :D

 There is a failure mode lurking here.  Suppose you take two sets
 of snapshots: local and remote.  You want to do an incremental
 send, for efficiency.  So you look at the set of snapshots on both
 machines and find the latest, common snapshot.  You will then
 send the list of incrementals from the latest, common through the
 latest snapshot.  On the remote machine, if there are any other
 snapshots not in the list you are sending and newer than the latest,
 common snapshot, then the send/recv will fail.  In practice, this
 means that if you use the zfs-auto-snapshot feature, which will
 automatically destroy older snapshots as it goes (eg. the default
 policy for frequent is take snapshots every 15 minutes, keep 4).

 If you never have an interruption in your snapshot schedule, you
 can merrily cruise along and not worry about this.  But if there is
 an interruption (for maintenance, perhaps) and a snapshot is
 destroyed on the sender, then you also must make sure it gets
 destroyed on the receiver.  I just polished that code yesterday,
 and it seems to work fine... though it makes folks a little nervous.
 Anyone with an operations orientation will recognize that there
 needs to be a good process wrapped around this, but I haven't
 worked through all of the scenarios on the receiver yet.

Very true.  In this context I think this would be fine.  You would
want a warning to pop up saying that a snapshot has been deleted
locally and will have to be overwritten on the backup, but I think
that would be ok.  If necessary you could have a help page explaining
why - essentially this is a copy of your pool, not just a backup of
your files, and to work it needs an accurate copy of your snapshots.
If you wanted to be really fancy, you could have an option for the
user to view the affected files, but I think that's probably over
complicating things.

I don't suppose there's any way the remote snapshot can be cloned /
separated from the pool just in case somebody wanted to retain access
to the files within it?


 I'm thinking that a setup like time slider would work well, where you
 specify how many of each age of snapshot to keep.  But I would want to be
 able to specify different intervals for different devices.

 eg. I might want just the latest one or two snapshots on a USB disk so I
 can take my files around with me.  On a removable drive however I'd be more
 interested in preserving a lot of daily / weekly backups.  I might even have
 an archive drive that I just store monthly snapshots on.

 What would be really good would be a GUI that can estimate how much space
 is going to be taken up for any configuration.  You could use the existing
 snapshots on disk as a guide, and take an average size for each interval,
 giving you average sizes for hourly, daily, weekly, monthly, etc...


 ha ha, I almost blew coffee out my nose ;-)  I'm sure that once
 the forward time-slider functionality is implemented, it will be
 much easier to manage your storage utilization :-)  So, why am
 I giggling?  My wife just remembered that she hadn't taken her
 photos off the camera lately... 8 GByte SD cards are the vehicle
 of evil destined to wreck your capacity planning :-)

Haha, that's a great image, but I've got some food for thought even with this.

If you think about it, even though 8GB sounds like a lot, it's barely over
1% of a 500GB drive, so it's not an unmanageable blip as far as
storage goes.

Also, if you're using the default settings for Tim's backups, you'll
be taking snapshots every 15 minutes, hour, day, week and month.  Now,
when you start you're not going to have any sensible averages for your
monthly snapshot sizes, but you're very rapidly going to get a set of
figures for your 15 minute snapshots.

What I would suggest is to use those to extrapolate forwards to give
very rough estimates of usage early on, with warnings as to how rough
these are.  In time these estimates will improve in accuracy, and your
8GB photo 'blip' should be relatively easily incorporated.

What you could maybe do is have a high and low usage estimate shown in
the GUI.  Early on these will be quite a 

Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-19 Thread Richard Elling
Ross wrote:
 Well, I really like the idea of an automatic service to manage send/receives 
 to backup devices, so if you guys don't mind, I'm going to share some other 
 ideas for features I think would be useful.
   

cool.

 One of the first is that you need some kind of capacity management and 
 snapshot deletion.  Eventually backup media are going to fill and you need to 
 either prompt the user to remove snapshots, or even better, you need to 
 manage the media automatically and remove old snapshots to make space for new 
 ones.
   

I've implemented something like this for a project I'm working on.
Consider this a research project at this time, though I hope to
leverage some of the things we learn as we scale up, out, and
refine the operating procedures.

There is a failure mode lurking here.  Suppose you take two sets
of snapshots: local and remote.  You want to do an incremental
send, for efficiency.  So you look at the set of snapshots on both
machines and find the latest, common snapshot.  You will then
send the list of incrementals from the latest, common through the
latest snapshot.  On the remote machine, if there are any other
snapshots not in the list you are sending and newer than the latest,
common snapshot, then the send/recv will fail.  In practice, this
means that if you use the zfs-auto-snapshot feature, which will
automatically destroy older snapshots as it goes (eg. the default
policy for frequent is take snapshots every 15 minutes, keep 4).

If you never have an interruption in your snapshot schedule, you
can merrily cruise along and not worry about this.  But if there is
an interruption (for maintenance, perhaps) and a snapshot is
destroyed on the sender, then you also must make sure it gets
destroyed on the receiver.  I just polished that code yesterday,
and it seems to work fine... though it makes folks a little nervous.
Anyone with an operations orientation will recognize that there
needs to be a good process wrapped around this, but I haven't
worked through all of the scenarios on the receiver yet.
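
As a hedged sketch of the recovery (names invented): once the latest
common snapshot is identified, destroy any receiver-only snapshots
newer than it, then send the intermediate increments in one go:

    zfs destroy backup/tank@receiver-only-snap
    zfs send -I tank@common tank@latest | zfs recv backup/tank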

 I'm thinking that a setup like time slider would work well, where you specify 
 how many of each age of snapshot to keep.  But I would want to be able to 
 specify different intervals for different devices.

 eg. I might want just the latest one or two snapshots on a USB disk so I can 
 take my files around with me.  On a removable drive however I'd be more 
 interested in preserving a lot of daily / weekly backups.  I might even have 
 an archive drive that I just store monthly snapshots on.

 What would be really good would be a GUI that can estimate how much space is 
 going to be taken up for any configuration.  You could use the existing 
 snapshots on disk as a guide, and take an average size for each interval, 
 giving you average sizes for hourly, daily, weekly, monthly, etc...
   

ha ha, I almost blew coffee out my nose ;-)  I'm sure that once
the forward time-slider functionality is implemented, it will be
much easier to manage your storage utilization :-)  So, why am
I giggling?  My wife just remembered that she hadn't taken her
photos off the camera lately... 8 GByte SD cards are the vehicle
of evil destined to wreck your capacity planning :-)

 That could then be used in a GUI (I'm thinking a visual column with colours 
 for each type of snapshot showing how full the drive would be).  You know the 
 size of the external drive (and that's fixed for each device), you also know 
 the average sizes of snapshots, so you can show the user how much space they 
 will have, and let them play around with the numbers.
   
I think there is some merit for this as a backwards-looking process.
 -- richard



Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Nicolas Williams
On Wed, Dec 17, 2008 at 10:02:18AM -0800, Ross wrote:
 In fact, thinking about it, could this be more generic than just a USB
 backup service?

Absolutely.

The tool shouldn't need to know that the backup disk is accessed via
USB, or whatever.  The GUI should, however, present devices
intelligently, not as cXtYdZ!

 And when you think that some of those targets could actually be stored
 on full ZFS pools on other OpenSolaris servers being provided over
 comstar, this could be just as useful in corporate environments as at
 home.

Yup.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Ross Smith
 Absolutely.

 The tool shouldn't need to know that the backup disk is accessed via
 USB, or whatever.  The GUI should, however, present devices
 intelligently, not as cXtYdZ!

Yup, and that's easily achieved by simply prompting for a user
friendly name as devices are attached.  Now you could store that
locally, but it would be relatively easy to drop an XML configuration
file on the device too, allowing the same friendly name to be shown
wherever it's connected.

And this is sounding more and more like something I was thinking of
developing myself.  A proper Sun version would be much better though
(not least because I've never developed anything for Solaris!).


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Nicolas Williams
On Thu, Dec 18, 2008 at 07:05:44PM +, Ross Smith wrote:
  Absolutely.
 
  The tool shouldn't need to know that the backup disk is accessed via
  USB, or whatever.  The GUI should, however, present devices
  intelligently, not as cXtYdZ!
 
 Yup, and that's easily achieved by simply prompting for a user
 friendly name as devices are attached.  Now you could store that
 locally, but it would be relatively easy to drop an XML configuration
 file on the device too, allowing the same friendly name to be shown
 wherever it's connected.

I was thinking more something like:

 - find all disk devices and slices that have ZFS pools on them
 - show users the devices and pool names (and UUIDs and device paths in
   case of conflicts)

 - let the user pick one.

 - in the case that the user wants to initialize a drive to be a backup
   you need something more complex.

- one possibility is to tell the user when to attach the desired
  backup device, in which case the GUI can detect the addition and
  then it knows that that's the device to use (but be careful to
  check that the user also owns the device so that you don't pick
  the wrong one on multi-seat systems)

- another is to be much smarter about mapping topology to physical
  slots and present a picture to the user that makes sense to the
  user, so the user can click on the device they want.  This is much
  harder.
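
(As a hedged sketch, the discovery step could start from something as
simple as:

    # list pools found on attached but unimported devices, with
    # pool names, IDs, and member devices
    zpool import

    # or restrict the search to a particular directory of device nodes
    zpool import -d /dev/dsk

though a real GUI would want a programmatic interface rather than
scraping CLI output.)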

Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Ross Smith
On Thu, Dec 18, 2008 at 7:11 PM, Nicolas Williams
nicolas.willi...@sun.com wrote:
 On Thu, Dec 18, 2008 at 07:05:44PM +, Ross Smith wrote:
  Absolutely.
 
  The tool shouldn't need to know that the backup disk is accessed via
  USB, or whatever.  The GUI should, however, present devices
  intelligently, not as cXtYdZ!

 Yup, and that's easily achieved by simply prompting for a user
 friendly name as devices are attached.  Now you could store that
 locally, but it would be relatively easy to drop an XML configuration
 file on the device too, allowing the same friendly name to be shown
 wherever it's connected.

 I was thinking more something like:

  - find all disk devices and slices that have ZFS pools on them
  - show users the devices and pool names (and UUIDs and device paths in
   case of conflicts)..

I was thinking that device & pool names are too variable, you need to
be reading serial numbers or ID's from the device and link to that.

  - let the user pick one.

  - in the case that the user wants to initialize a drive to be a backup
   you need something more complex.

- one possibility is to tell the user when to attach the desired
  backup device, in which case the GUI can detect the addition and
  then it knows that that's the device to use (but be careful to
  check that the user also owns the device so that you don't pick
  the wrong one on multi-seat systems)

- another is to be much smarter about mapping topology to physical
  slots and present a picture to the user that makes sense to the
  user, so the user can click on the device they want.  This is much
  harder.

I was actually thinking of a resident service.  Tim's autobackup
script was capable of firing off backups when it detected the
insertion of a USB drive, and if you've got something sitting there
monitoring drive insertions you could have it prompt the user when new
drives are detected, asking if they should be used for backups.

Of course, you'll need some settings for this so it's not annoying if
people don't want to use it.  A simple tick box on that pop up dialog
allowing people to say don't ask me again would probably do.

You'd then need a second way to assign drives if the user changed
their mind.  I'm thinking this would be to load the software and
select a drive.  Mapping to physical slots would be tricky, I think
you'd be better with a simple view that simply names the type of
interface, the drive size, and shows any current disk labels.  It
would be relatively easy then to recognise the 80GB USB drive you've
just connected.

Also, because you're formatting these drives as ZFS, you're not
restricted to just storing your backups on them.  You can create a
root pool (to contain the XML files, etc), and the backups can then be
saved to a filesystem within that.

That means the drive then functions as both a removable drive, and as
a full backup for your system.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Nicolas Williams
On Thu, Dec 18, 2008 at 07:55:14PM +, Ross Smith wrote:
 On Thu, Dec 18, 2008 at 7:11 PM, Nicolas Williams
 nicolas.willi...@sun.com wrote:
  I was thinking more something like:
 
   - find all disk devices and slices that have ZFS pools on them
   - show users the devices and pool names (and UUIDs and device paths in
case of conflicts)..
 
 I was thinking that device & pool names are too variable, you need to
 be reading serial numbers or ID's from the device and link to that.

Device names are, but there's no harm in showing them if there's
something else that's less variable.  Pool names are not very variable
at all.

   - in the case that the user wants to initialize a drive to be a backup
you need something more complex.
 
 - one possibility is to tell the user when to attach the desired
   backup device, in which case the GUI can detect the addition and
   then it knows that that's the device to use (but be careful to
   check that the user also owns the device so that you don't pick
   the wrong one on multi-seat systems)
 
 I was actually thinking of a resident service.  Tim's autobackup
 script was capable of firing off backups when it detected the
 insertion of a USB drive, and if you've got something sitting there
 monitoring drive insertions you could have it prompt the user when new
 drives are detected, asking if they should be used for backups.

That will do.  Of course, there may be other uses for removable drives
than just backups, so this will probably have to be a plug-in framework.

 Of course, you'll need some settings for this so it's not annoying if
 people don't want to use it.  A simple tick box on that pop up dialog
 allowing people to say don't ask me again would probably do.

I would like something better than that.  "Don't ask me again" sucks
when much, much later you want to be asked and you don't know how to get
the system to ask you.

 You'd then need a second way to assign drives if the user changed
 their mind.  I'm thinking this would be to load the software and
 select a drive.  Mapping to physical slots would be tricky, I think
 you'd be better with a simple view that simply names the type of
 interface, the drive size, and shows any current disk labels.  It
 would be relatively easy then to recognise the 80GB USB drive you've
 just connected.

Right, so do as I suggested: tell the user to remove the device if it's
plugged in, then plug it in again.  That way you can know unambiguously
which device it is (unless the user is doing this with more than one
device at a time).

 Also, because you're formatting these drives as ZFS, you're not
 restricted to just storing your backups on them.  You can create a
 root pool (to contain the XML files, etc), and the backups can then be
 saved to a filesystem within that.

Absolutely!

 That means the drive then functions as both a removable drive, and as
 a full backup for your system.

Yup.  Yet another reason for using zfs send|recv for backups instead of
mirrors.

Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Richard Elling
Nicolas Williams wrote:
 On Thu, Dec 18, 2008 at 07:55:14PM +, Ross Smith wrote:
   
 On Thu, Dec 18, 2008 at 7:11 PM, Nicolas Williams
 nicolas.willi...@sun.com wrote:
 
 I was thinking more something like:

  - find all disk devices and slices that have ZFS pools on them
  - show users the devices and pool names (and UUIDs and device paths in
   case of conflicts)..
   
 I was thinking that device & pool names are too variable, you need to
 be reading serial numbers or ID's from the device and link to that.
 

 Device names are, but there's no harm in showing them if there's
 something else that's less variable.  Pool names are not very variable
 at all.
   

I was thinking of something a little different.  Don't worry about
devices, because you don't send to a device (rather, send to a pool).
So a simple list of source file systems and a list of destinations
would do.  I suppose you could work up something with pictures
and arrows, like Nautilus, but that might just be more confusing
than useful.

But that is the easy part.  The hard part is dealing with the plethora
of failure modes...
 -- richard



Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Ross Smith
 Of course, you'll need some settings for this so it's not annoying if
 people don't want to use it.  A simple tick box on that pop up dialog
 allowing people to say don't ask me again would probably do.

 I would like something better than that.  Don't ask me again sucks
 when much, much later you want to be asked and you don't know how to get the 
 system to ask you.

Only if your UI design doesn't make it easy to discover how to add
devices another way, or turn this setting back on.

My thinking is that this actually won't be the primary way of adding
devices.  It's simply there for ease of use for end users, as an easy
way for them to discover that they can use external drives to backup
their system.

Once you have a backup drive configured, most of the time you're not
going to want to be prompted for other devices.  Users will generally
setup a single external drive for backups, and won't want prompting
every time they insert a USB thumb drive, a digital camera, phone,
etc.

So you need that initial prompt to make the feature discoverable, and
then an easy and obvious way to configure backup devices later.

 You'd then need a second way to assign drives if the user changed
 their mind.  I'm thinking this would be to load the software and
 select a drive.  Mapping to physical slots would be tricky, I think
 you'd be better with a simple view that simply names the type of
 interface, the drive size, and shows any current disk labels.  It
 would be relatively easy then to recognise the 80GB USB drive you've
 just connected.

 Right, so do as I suggested: tell the user to remove the device if it's
 plugged in, then plug it in again.  That way you can known unambiguously
 (unless the user is doing this with more than one device at a time).

That's horrible from a user's point of view though.  Possibly worth
having as a last resort, but I'd rather just let the user pick the
device.  This does have potential as a "help me find my device"
feature though.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Ross Smith
 I was thinking more something like:

  - find all disk devices and slices that have ZFS pools on them
  - show users the devices and pool names (and UUIDs and device paths in
  case of conflicts)..


 I was thinking that device & pool names are too variable, you need to
 be reading serial numbers or ID's from the device and link to that.


 Device names are, but there's no harm in showing them if there's
 something else that's less variable.  Pool names are not very variable
 at all.


 I was thinking of something a little different.  Don't worry about
 devices, because you don't send to a device (rather, send to a pool).
 So a simple list of source file systems and a list of destinations
 would do.  I suppose you could work up something with pictures
 and arrows, like Nautilus, but that might just be more confusing
 than useful.

True, but if this is an end user service, you want something that can
create the filesystem for them on their devices.  An advanced mode
that lets you pick any destination filesystem would be good for
network admins, but for end users they're just going to want to point
this at their USB drive.

 But that is the easy part.  The hard part is dealing with the plethora
 of failure modes...
 -- richard

Heh, my response to this is who cares? :-D

This is a high-level service; it's purely concerned with "backup
succeeded" or "backup failed", possibly with an "overdue for backup"
prompt if you want to help the user manage the backups.

Any other failure modes can be dealt with by the lower level services
or by the user.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Nicolas Williams
On Thu, Dec 18, 2008 at 12:57:54PM -0800, Richard Elling wrote:
 Nicolas Williams wrote:
 Device names are, but there's no harm in showing them if there's
 something else that's less variable.  Pool names are not very variable
 at all.
 
 I was thinking of something a little different.  Don't worry about
 devices, because you don't send to a device (rather, send to a pool).

Right, which is why I want a pool name listed :)

 So a simple list of source file systems and a list of destinations
 would do.  I suppose you could work up something with pictures
 and arrows, like Nautilus, but that might just be more confusing
 than useful.

Problem is: say you have N removable devices plugged in.  Which to use?
You can eliminate the ones that don't have pools on them.  And then?  In
the end you definitely need user input *when the backup device is
initialized*.  After that the backups can and should be automatic (plug
it in, there it goes).

 But that is the easy part.  The hard part is dealing with the plethora
 of failure modes...

Well, yes, but in GUIs you do that by throwing dialogs at the user :^/

The main failure mode is the device getting removed while the pool is
still imported.

Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-18 Thread Ross
Well, I really like the idea of an automatic service to manage send/receives to 
backup devices, so if you guys don't mind, I'm going to share some other ideas 
for features I think would be useful.

One of the first is that you need some kind of capacity management and snapshot 
deletion.  Eventually backup media are going to fill and you need to either 
prompt the user to remove snapshots, or even better, you need to manage the 
media automatically and remove old snapshots to make space for new ones.

I'm thinking that a setup like time slider would work well, where you specify 
how many of each age of snapshot to keep.  But I would want to be able to 
specify different intervals for different devices.

eg. I might want just the latest one or two snapshots on a USB disk so I can 
take my files around with me.  On a removable drive however I'd be more 
interested in preserving a lot of daily / weekly backups.  I might even have an 
archive drive that I just store monthly snapshots on.

What would be really good would be a GUI that can estimate how much space is 
going to be taken up for any configuration.  You could use the existing 
snapshots on disk as a guide, and take an average size for each interval, 
giving you average sizes for hourly, daily, weekly, monthly, etc...

That could then be used in a GUI (I'm thinking a visual column with colours for 
each type of snapshot showing how full the drive would be).  You know the size 
of the external drive (and that's fixed for each device), you also know the 
average sizes of snapshots, so you can show the user how much space they will 
have, and let them play around with the numbers.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Ross
Thinking about it, I think Darren is right.  An automatic send/receive to the 
external drive may be preferable, and it sounds like it has many advantages:

1.  It's directional, your backups will always go from your live drive to the 
backup, never the other way unless you actually force it with -f.

2.  It protects any changes on your backup drive.

3.  It doesn't affect the performance of your original drive.

4.  You can run the actual send/receive at a lower priority (could it be 
throttled too?), reducing the impact on the system.

5.  Differently sized disks work fine, and in fact a larger external disk 
potentially allows you to store more snapshots on there than on your live 
system.

6.  Tim Foster already has something like this working:
 http://blogs.sun.com/timf/entry/zfs_automatic_backup_0_1

The only things missing that I can think of are ETA calculations for send / 
receive, and the fact that I'm not sure how you would boot or restore from it.  
A manual restore process wouldn't be too much of a hassle though, and if we're 
honest, an ETA or even a progress indicator for zfs send/receive would be a 
godsend for Solaris anyway.

After all, don't you just love it when you're backing up a 2TB storage array
and your boss says "how long is that going to take?", and your only answer is
"haven't a clue boss.  Theoretically around 8 hours."


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Ross
In fact, thinking about it, could this be more generic than just a USB backup 
service?

If this were a scheduled backup system, regularly sending snapshots to another 
device, with a nice end user GUI, there's nothing stopping it working with any 
device the user points it at.  So you could use USB, Firewire, SATA, iSCSI, 
SAS, Fibre channel... 

And when you think that some of those targets could actually be stored on full 
ZFS pools on other OpenSolaris servers being provided over comstar, this could 
be just as useful in corporate environments as at home.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Nicolas Williams
On Wed, Dec 17, 2008 at 12:05:50AM -0800, Ross wrote:
 Thinking about it, I think Darren is right.  An automatic send/receive to the 
 external drive may be preferable, and it sounds like it has many advantages:

You forgot *the* most important advantage of using send/recv instead of
mirroring as the backup method:

 - Your backup is a pool in its own right that can be imported anywhere

   Detached mirrors don't get detached in such a way that you could then
   import them as a separate pool.  An option to do that would be
   excellent though: detach as pool, with pool name and UUID assigned on
   detach, plus a way to re-attach mirrors detached that way.
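
   (The zpool split capability mentioned elsewhere in this thread is
   essentially this; a sketch of how such an interface might look:

       zpool split tank tankbackup   # detach one half of each mirror
       zpool import tankbackup      # now a pool in its own right

   -- syntax hypothetical here, since split isn't implemented yet.)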


Other major advantages of this method:

 - You get to be selective as to which snapshots to back up

   With a mirror it's all-or-nothing.  With send/recv you get to choose
   what to send to the backup.


 - You get to be selective as to which snapshots to delete from your backup

   If you mirror then you lose any snapshots which have been deleted in
   the primary drive.


Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Niall Power
 What serious compat issues ?  There has been one and
 only one 
 incompatible change in the stream format and that
 only impacted really 
 really early (before S10 FCS IIRC) adopters.

Here are the issues that I am aware of:
 - Running zfs upgrade on a zfs filesystem will cause the zfs send stream 
output format to be incompatible with older versions of the software. This is 
according to the zfs man page.
 - Again from the zfs man page:
 The format of the stream is evolving. No backwards compatibility
 is guaranteed. You may not be able to receive your streams on
 future versions of ZFS.
   So while the stream format has only changed once in the past, it
   doesn't provide much reassurance that it won't change in the future.
   This basically rules out network backups unless the remote target is
   running a compatible version of ZFS, since file blobs of the stream
   may be incompatible in the future.
- From the thread I linked to in my original post, it was pointed out
  that there is no error checking or checksum info in the file blobs of
  the stream.

These would appear to be real blockers for this approach for us. Am I
misunderstanding the issues? Or is there a viable solution that isn't
subject to these constraints?


 * Simplified UI - user doesn't have to configure
 backup schedules etc.
 
 I actually see that as a downside, but given we have
 autosnapshots
 with time-slider it is acceptable.

Why would it be a downside? Normal users don't like doing backups.
Our software will be more useful if it doesn't have to rely on the user
doing something we know they are not going to do until it's too late.

 
 * Resynchronisation is always optimal because
 zfs handles it directly rather than some 
external program that can't optimise as
 effectively
 
 However the MAJOR disadvantage of using mirroring
 though is that the 
 backup disk needs to be at least as large as the
 normal rpool disk (and 
 will only be used to the exact size - resulting in
 wastage).  Rather 
 than when using zfs send/recv where the backup disk
 only needs to be big 
 enough to hold the data stored on the rpool.
 

This is true, but consumer-level storage is so cheap nowadays.
One company has even turned it into an excuse to sell external storage
devices. I think they're named after some kind of fruit or something :)


 
 For example I have a 150G internal drive in my laptop
 but the smallest 
 drive I can easily buy in a retail/consumer enclosure
 to attach via USB 
 is 500G that is a massive amount of waste.  At the
 other end of the 
 scale USB flash drives are around the 16G mark but
 might be plenty big 
 enough for the datasets I actually care about (like
 my data not the OS 
 and not my Music because I have other copies of that
 on my iPods anyway).

We could create a partition on the disk that's 
large enough to mirror the primary pool, leaving the
rest of the disk free for the user to use as they wish.
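
Roughly (slice names invented): partition the external disk with
format(1M), then attach the first slice as a mirror of the root pool:

    zpool attach rpool c1t0d0s0 c5t0d0s0

leaving the remaining slice free for a separate pool.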

 
 Using zfs send/recv also allows for the possibility
 of using a single 
 backup disk (pool) for multiple machines, making
 better use of that 500G 
 USB drive than backing up a single 150G internal
 laptop drive.
 
 Really to do this properly with mirrors the zpool
 split capability 

I haven't heard of this before. Any pointers?

Thanks,
Niall.

 needs to be implemented, however I think it is the
 wrong solution - or 
 rather a complementary one to using zfs send/recv (or
 even using rsync 
 if preservation of the snapshots isn't important).
 
 -- 
 Darren J Moffat


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Nicolas Williams
On Wed, Dec 17, 2008 at 08:51:54AM -0800, Niall Power wrote:
  What serious compat issues ?  There has been one and
  only one 
  incompatible change in the stream format and that
  only impacted really 
  really early (before S10 FCS IIRC) adopters.
 
 Here are the issues that I am aware of:
  - Running zfs upgrade on a zfs filesystem will cause the zfs send
  stream output format to be incompatible with older versions of the
  software. This is according to the zfs man page.  - Again from the
  zfs man page:
  The format of the stream is evolving. No backwards compatibility
  is guaranteed. You may not be able to receive your streams on
  future versions of ZFS.
So while the stream format has only changed once in the past, it
doesn't provide much reassurance that it won't change in the
future. This basically rules out network backups unless the remote
target is running a compatible version of ZFS, since file blobs of
the stream may be incompatible in the future.

The solution to this is to recv the stream immediately.

So you have a disk for backups, right?  Make it a pool, and zfs recv
into it your zfs send backups.  That means the zfs sends are transient,
so no compatibility issues will arise.

And you get a number of benefits that I described earlier today.

 - From the thread I linked to in my original post, it was pointed out
 that there is no error checking or checksum info in the file blobs of
 the stream.

I guess you mean that zfs recv doesn't verify the integrity of zfs send
streams.  But you can ensure the integrity of zfs send streams easily
(e.g., use SSHv2 to move them around and always store them on ZFS
datasets if you must store them).
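
For example (host and dataset names invented):

    zfs send -i tank/home@monday tank/home@tuesday | \
        ssh backuphost zfs recv -d backup

The stream is never stored as a file, so stream-format compatibility
only matters between the two ends at the moment of transfer.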

 These would appear to be real blockers for this approach for us. Am I
 misunderstanding the issues? Or is there a viable solution that isn't
 subject to these constraints?

They are not blockers, not remotely for this use.  See above.

  * Resynchronisation is always optimal because

See my reply to Ross today about this.  Using send/recv you get to
decide which snapshots to back up and which backed-up snapshots to
delete.  You can't do the same with the mirroring approach.

 We could create a partition on the disk that's 
 large enough to mirror the primary pool, leaving the
 rest of the disk free for the user to use as they wish.

Too complicated.  Make the backup disk a pool and recv zfs send streams
into it.

  Really to do this properly with mirrors the zpool
  split capability 

Yes, but let's not do it with mirrors.

Nico


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Ross
Hey Niall,

 Here are the issues that I am aware: 
 - Running zfs upgrade on a zfs filesystem will
 cause the zfs send stream output format to be
 incompatible with older versions of the software.
  This is according to the zfs man page.
 - Again from the zfs man page:
 The format of the stream is evolving. No backwards compatibility
 is guaranteed. You may not be able to receive your streams on
 future versions of ZFS.

Well yes, but this really shouldn't be a problem for this usage.  It just means 
there's no guarantee that you can always do a send/receive between different 
versions of Solaris (although most of the time you'll be ok), but there's 
nothing stopping you doing a send/receive to an external device.  It will never 
be a problem in this situation because you're doing the send/receive on the 
same system, so you can't be running two clashing versions.

The absolute worst case scenario I can think of is that you might have to keep 
the external device's zfs pool running the same version of zfs as your live 
pool.  I really don't see this being a problem for you.

 From the thread I linked to in my original post, it
 was pointed out that there is no error
 checking or checksum info in the file blobs of the
  stream.

No, there's no error checking while *sending* the stream.  However, ZFS checks 
as it's receiving it, so provided you are receiving this into a live zfs pool 
this isn't a concern.

The checksum issue is only a problem if you're storing the zfs send as a binary 
blob.  Receiving it into a proper zfs pool is fine.

It's not quite as flexible as mirroring in that you can't do this as
incrementally as mirrors (i.e. you can't restart the operation); each
stream is all or nothing.
sending incremental snapshots so there shouldn't be too much data to send.  And 
that ties in perfectly with time slider's automatic snapshots.

I would also hope that zfs send/receive might improve in the future to allow 
for resuming sends since this seems a relatively common request.

  However the MAJOR disadvantage of using mirroring
  though is that the 
  backup disk needs to be at least as large as the
  normal rpool disk (and 
  will only be used to the exact size - resulting in
  wastage).  Rather 
  than when using zfs send/recv where the backup
 disk
  only needs to be big 
  enough to hold the data stored on the rpool.
  
 
 This is true, but consumer-level storage is so cheap nowadays.
 One company has even turned it into an excuse to sell external
 storage devices. I think they're named after some kind of fruit or
 something :)

Maybe, but being able to store *years* worth of snapshots on your external 
media, even if you only have space for a few months worth on your live system 
is a big plus.

Also, there's no need to keep just one external backup drive.  You could just 
as easily send to two of them.  Or even buy a 500GB drive and synchronise all 
your snapshots to that for a year, then buy another one when it's full and 
start sending the next lot of snapshots.

There's potential to keep far, far more data this way, in a much more flexible 
way.  And thinking about it, I think it ties in perfectly with time slider.  
You could have a second set of options for how long you want to keep snapshots 
on your external devices.

You could even have those options configurable per device.  So you could have a 
200GB backup device that you just keep your recent data on, and a 1TB one that 
you use occasionally but store years worth of snapshots on.

By running a script as the devices are plugged in, you could check the pools to 
synchronise and the last snapshot received.  From there you could look at the 
local rules and decide which new snapshots need sending over to bring the 
device up to date.

It also means you can show status more easily, without any of the confusion 
mirrors would cause in zpool status.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-17 Thread Niall Power
 In the long run some USB stick problems may surface because the wear
 leveling is done in 16MB sections, and you could blow your stick if
 you have a 16MB region which is ``hot''.  I wonder if parts of a zpool
 are hotter than others?  With AVS the dirty bitmap might be hot.

 I guess you are not really imagining sticks though, just testing with
 them.  You're imagining something more like the time capsule, where
 the external drive is bigger than the internal one, that it'll be used
 more on laptops.  At home you keep a large, heavy disk which holds a
 mirror of your laptop ZFS root on one slice, plus an unredundant
 scratch pool made of the extra free space.

Yes, that's exactly what I had in mind. I was ambiguous about the
specifics, but I actually was testing with an external notebook
drive enclosure connected via USB. Probably wouldn't have as
fancy a name as Time Capsule though, since it would work with
off-the-shelf parts :)


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Ross
Does 1. really need to be fixed?

I ask this since I imagine there will be some resistance from the ZFS team to 
essentially breaking the spec for the sake of not confusing some users.

I would argue that anybody who knows enough to run zpool status is also 
capable of learning what a mirror is and how it works, and that this is then 
more a training / documentation / expectations issue.  In your documentation 
for this feature, include a section for advanced users explaining what zpool 
status will show and I don't see any problem.

What I would suggest is that since this is end user / desktop functionality, 
why don't you create a desktop GUI for reporting on the status of the backup 
mirror?  That would seem to make more sense to me than modifying ZFS.  You 
could create a nice simple GUI that shows whether the device is connected or 
not, and gives a rough estimate to the user of how far through the resilver 
process it is.

You could even have it live in the system tray, with the icon showing whether 
the device is connected, disconnected or resilvering, with the tooltip 
reporting the resilver status.

That sounds a lot nicer for end users than having them try to interpret
zpool status.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Niall Power
Andrew Gabriel wrote:

 Different USB memory sticks vary enormously in speed.
 The speed is often not described on the packaging, so it's often not 
 possible to know how fast one is until after you've bought it and 
 tried it.

This was tested with an external laptop hard disk inside a USB enclosure.
There's an assumption I should have made clearer from the outset: we
intend to mirror the entire root pool (rpool) of an OpenSolaris
installation, which could be all of, or a large chunk of, the capacity
of a modern hard disk. For mirroring the root pool on an OpenSolaris
system, a USB memory stick probably won't have the necessary capacity
in anything but a very limited set of circumstances.
At least disks are cheap :)

Thanks,
Niall.


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Niall Power
 Does 1. really need to be fixed?
 

I'm not suggesting that it's currently broken; I'm just asking if it would be
reasonable to special-case our usage a little bit in order to avoid
unnecessary alarm to users. This will be seen as a fit-and-finish/polish
issue: if it's easy to address then we should try to. I accept that it may
not be as straightforward as I hope, however.

Perhaps time-slider could set a reserved property on the mirrored zpool to
indicate to ZFS that this mirror device may get unplugged a lot, so that ZFS
could simply tailor the status message and leave everything else as is?
I don't personally see it as a blocker, but it is definitely a nice-to-have.
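
Something along these lines, purely as a sketch (the property name is
invented for illustration; no such reserved property exists today):

  # Tag the pool's top-level dataset so tools know this mirror is
  # expected to come and go (hypothetical property name).
  zfs set org.opensolaris.time-slider:removable-mirror=on rpool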

 I ask this since I imagine there will be some
 resistance from the ZFS team to essentially breaking
 the spec for the sake of not confusing some users.

I'm not expecting the existing spec to be broken; I'm asking if an
augmentation to the spec would be possible. I too would be opposed to any
changes that would break the existing spec, since what we want for
time-slider is special-case usage.

 
 I would argue that anybody who knows enough to run
 zpool status is also capable of learning what a
 mirror is and how it works, and that this is then
 more a training / documentation / expectations issue.
 In your documentation for this feature, include a
 section for advanced users explaining what zpool
  status will show and I don't see any problem.
 

That might be an acceptable solution too, if addressing 1. is not feasible.

 What I would suggest is that since this is end user /
 desktop functionality, why don't you create a desktop
 GUI for reporting on the status of the backup
 mirror?  That would seem to make more sense to me
 than modifying ZFS.  You could create a nice simple
 GUI that shows whether the device is connected or
 not, and gives a rough estimate to the user of how
 far through the resilver process it is.
 
 You could even have it live in the system tray, with
 the icon showing whether the device is connected,
 disconnected or resilvering, with the tooltip
 reporting the resilver status.
 
Yep, we're considering this also, exactly along the lines you suggest.
From my own observations it seems that the resilver completion
estimates are rather inaccurate. We may have to restrict the scope of
notification to simple event responses (connected:uptodate -> disconnected ->
reconnected:resilvering -> connected:uptodate).


 That sounds a lot nice for end users than having them
 try to interpret zpool status.

It does, but it's generally bad form to have the CLI and GUI in disagreement
with each other, or to be inconsistent about how they inform the user.

One other question I have about using mirrors is potential performance
implications.
In a common scenario the user might be using the main (S)ATA attached disk
and a USB external disk in a mirror configuration. Could the slower disk
become a bottleneck because of its lower I/O read/write speeds? Would a
system write block until ZFS had written the data to both sides of the
mirror?

Similarly, could a detached mirror device slow down reads/writes because the
pool is doing extra work to cope with the missing device?

Thanks,
Niall.
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Niall Power
Hi Volker,

 Yes, by all means.  I am doing something very similar
 on my T1000, but
 I have two separate one-disk pools and copy to the
 backup pool using
 rsync.  I would very much like to replace this with
 automatic resilvering.
 
 One prerequisite for wide adoption would be to fix
 the issue #1 you
 described above.  I would advise not to integrate
 this anywhere before
 fixing that degraded display.
 
 BTW is this USB-specific?  While it seems to imply
 that, you don't state
 it anywhere explicitly.  I attach my backup disk via
 eSATA, power it up,
 import the pool, etc.  Not really hotplugging...

No, it's definitely not USB specific. We're in principle happy with any
type of block storage device that ZFS is happy with, provided it
has enough capacity to mirror the full capacity of the device on which
the existing root pool lives.

So a second hard disk connected internally would be equally fine and even
preferable, but that's probably not the common
case since we expect most users to be laptop users who will be connecting
external disks via USB, or maybe firewire or eSATA in a minority of cases.
If the solution is capable of dealing with hotplug then presumably it should
have no problem with permanently attached devices or devices that are 
attached before boot.

Just for clarification I'll mention that we're not considering anything more
exotic than a two-way mirror configuration, and we will be assuming a single
pool containing the root and user home directory filesystems within it.
Users running raidz or multiple pools etc. would be beyond the scope of what
we're aiming for.
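
For reference, that setup boils down to a plain two-way attach plus making
the external half bootable. A sketch, with device names that are examples
only:

  # Attach the external disk as the second half of the root pool mirror.
  zpool attach rpool c3d0s0 c5t0d0s0
  # Install GRUB on the new half so it can be booted from (x86).
  installgrub /boot/grub/stage1 /boot/grub/stage2 /dev/rdsk/c5t0d0s0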

Thanks,
Niall.

 
 
 Regards -- Volker
 -- 
 Volker A. Brandt  Consulting and Support for Sun Solaris
 Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
 Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
 Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
 Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Niall Power
 
 Yes to both I believe, while the USB device is
 attached your system will run slower, and it will run
 considerably slower while replicating data.
 Hopefully USB 3 or eSATA drives would address this
  to some extent.

I think I've confirmed this is the case, at least in the configuration
I tried. With a USB mirror device configured, a timed write of a 200MB
file took about 19 seconds on average with the USB drive detached.
After reattaching and waiting for the resilver to complete, the write took
25 seconds on average. Detaching the USB mirror from the pool entirely
and just having the single laptop disk in the pool gave the best results,
at about 16 seconds on average. Obviously this is all purely anecdotal
data, but it would appear to agree with the presumptions.
On a positive note, it seems that the performance hit is much worse when
both sides of the mirror are online than when the device is detached and
the pool is degraded, but I'm sure we'll pay for it later when the disk
is reattached and it resilvers. Maybe that's acceptable for our target
audience.
It definitely seems like there are performance issues that I need to better
understand before jumping in feet first with this.
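
For anyone wanting to reproduce this kind of rough measurement, a timed
sequential write is enough. A sketch (path and size are examples; the sync
keeps the timing from measuring only the in-memory cache):

  # Time a 200MB sequential write to the pool.
  ptime sh -c 'dd if=/dev/zero of=/rpool/testfile bs=1024k count=200; sync'
  rm /rpool/testfile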
Would a lop sided mirror behave any differently in comparison to this?
Does it also ensure that writes are sent to all sides of the mirror before 
returning?

Thanks,
Niall.
 

 
 However, the idea of lop sided mirrors was discussed
 a while back in the availability thread (warning,
 it's long:
 http://www.opensolaris.org/jive/message.jspa?messageID=311743).  There
 are many people who want ZFS to
 support lop sided mirrors, and I've tried to raise an
 RFE twice, but I don't believe I've ever seen a bug
 ID for it, it seems to get lost in the system.
 
 And yes, the resilver estimates are all over the
 place, even when you don't allow for the fact that it
 still restarts regularly.  Would you be able to get
 away with just reporting the percentage complete and
 ignoring the estimates until that code is improved?
 
 Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Ross
 One other question I have about using mirrors is
 potential performance implications.
 In a common scenario the user might be using the main
 S(ATA) attached disk and
 a USB external disk as a mirror configuration. Could
 the slower disk become a 
 bottleneck because of it's lower I/O read/write
 speeds? Would a system write block
 until ZFS had written the data to both sides of the
 mirror?
 
 Similarly, could the detached mirror device slow down
 reads/writes because it's 
 doing extra work to cope with the missing mirror
 device?
 
 Thanks,
 Niall.

Yes to both, I believe: while the USB device is attached your system will
run slower, and it will run considerably slower while replicating data.
Hopefully USB 3 or eSATA drives will address this to some extent.

However, the idea of lop sided mirrors was discussed a while back in the
availability thread (warning, it's long:
http://www.opensolaris.org/jive/message.jspa?messageID=311743).  There are
many people who want ZFS to support lop sided mirrors, and I've tried to
raise an RFE twice, but I don't believe I've ever seen a bug ID for it; it
seems to get lost in the system.

And yes, the resilver estimates are all over the place, even when you don't 
allow for the fact that it still restarts regularly.  Would you be able to get 
away with just reporting the percentage complete and ignoring the estimates 
until that code is improved?
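
Extracting just the percentage is easy enough to script. A sketch (the pool
name is an example, and the exact wording of the status line varies between
builds):

  # Print e.g. 34.16% done while a resilver or scrub is running.
  zpool status rpool | sed -n 's/.*, \([0-9.]*% done\).*/\1/p'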

Ross
-- 
This message posted from opensolaris.org
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Miles Nordin
 np == Niall Power niall.po...@sun.com writes:

np So I'd like to ask if this is an appropriate use of ZFS mirror
np functionality?

I like it a lot.

I tried to set up something like that ad-hoc using a firewire disk on
an Ultra10 at first, and then, just as you thought, tried using one
firewire disk and one iSCSI disk to make the mirror.  It was before
ZFS boot, so I mirrored /usr and /var only with ZFS, and / with SVM
(internal 2GB scsi to firewire).  I was trying to get around the 128GB
PATA limitation in the Ultra 10.  It was a lot of silliness, but it
was still useful even though I ran into a lot of bugs that have been
fixed since I was trying it.  The stuff you successfully tested
explores a lot of the problem areas I had---hangs on disconnecting,
incomplete resilvering, both sound fixed---but iSCSI still does not
work well because the system will ``patiently wait'' forever during
boot for an absent iSCSI target.  On SPARC neither firewire nor iscsi
was bootable back then, so you're in a much better spot there too than
I was with only a single bootable SVM component and a lot of painful
manual rescue work to do if that failed.

From reading the list you might be able to do something similar with
the storagetek AVS/ii/geo-cluster stuff, but I haven't tried it and
remember some problem with running it on localhost---I think you need
two machines, just because of UI limitations.  It might resilver
faster than ZFS though, and it's always a Plan B if you run into a
show-stopper.  Also (if it worked at all) it solves the
slower-performance-while-connected problem.

In the long run some USB stick problems may surface because the wear
leveling is done in 16MB sections, and you could blow your stick if
you have a 16MB region which is ``hot''.  I wonder if parts of a zpool
are hotter than others?  With AVS the dirty bitmap might be hot.

I guess you are not really imagining sticks though, just testing with
them.  You're imagining something more like the time capsule, where
the external drive is bigger than the internal one, that it'll be used
more on laptops.  At home you keep a large, heavy disk which holds a
mirror of your laptop ZFS root on one slice, plus an unredundant
scratch pool made of the extra free space.

Finally, I still don't understand the ZFS quorum rules.  What happens
if you:

  (1) boot the internal disk, change some stuff, shut down.  

  (2) Then boot the USB-stick/big-home-disk, change some stuff, shut down.

  (3) Then boot with both disks.

Corruption or successful scrub?  Which changes survive?  Because people WILL
do that.  Some will not even remember that they did it, and will even lie
and deny it.


___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-16 Thread Tim
On Tue, Dec 16, 2008 at 1:53 PM, Miles Nordin car...@ivy.net wrote:

  np == Niall Power niall.po...@sun.com writes:

np So I'd like to ask if this is an appropriate use of ZFS mirror
np functionality?

 I like it a lot.

 I tried to set up something like that ad-hoc using a firewire disk on
 an Ultra10 at first, and then, just as you thought, tried using one
 firewire disk and one iSCSI disk to make the mirror.  It was before
 ZFS boot, so I mirrored /usr and /var only with ZFS, and / with SVM
 (internal 2GB scsi to firewire).  I was trying to get around the 128GB
 PATA limitation in the Ultra 10.  It was a lot of silliness, but it
 was still useful even though I ran into a lot of bugs that have been
 fixed since I was trying it.  The stuff you successfully tested
 explores a lot of the problem areas I had---hangs on disconnecting,
 incomplete resilvering, both sound fixed---but iSCSI still does not
 work well because the system will ``patiently wait'' forever during
 boot for an absent iSCSI target.  On SPARC neither firewire nor iscsi
 was bootable back then, so you're in a much better spot there too than
 I was with only a single bootable SVM component and a lot of painful
 manual rescue work to do if that failed.

 From reading the list you might be able to do something similar with
 the storagetek AVS/ii/geo-cluster stuff, but I haven't tried it and
 remember some problem with running it on localhost---I think you need
 two machines, just because of UI limitations.  It might resilver
 faster than ZFS though, and it's always a Plan B if you run into a
 show-stopper.  Also (if it worked at all) it solves the
 slower-performance-while-connected problem.

 In the long run some USB stick problems may surface because the wear
 leveling is done in 16MB sections, and you could blow your stick if
 you have a 16MB region which is ``hot''.  I wonder if parts of a zpool
 are hotter than others?  With AVS the dirty bitmap might be hot.

 I guess you are not really imagining sticks though, just testing with
 them.  You're imagining something more like the time capsule, where
 the external drive is bigger than the internal one, that it'll be used
 more on laptops.  At home you keep a large, heavy disk which holds a
 mirror of your laptop ZFS root on one slice, plus an unredundant
 scratch pool made of the extra free space.

 Finally, I still don't understand the ZFS quorum rules.  What happens
 if you:

  (1) boot the internal disk, change some stuff, shut down.

  (2) Then boot the USB-stick/big-home-disk, change some stuff, shut down.

  (3) Then boot with both disks.

 Corruption or successful scrub?  Which changes survive?  Because people
 WILL do that.  Some will not even remember that they did it, and will
 even lie and deny it.



I did similar, although I can't say I did extensive testing.  When verifying
that both drives were working properly, I simply pulled one, booted, checked
around to make sure the system was fine, then halted.  Pulled that drive and
put the other one in, made sure everything came up fine, halted.  Finally
booted with both in and did a scrub.  It did scrub, and it did do so
correctly.  I guess I didn't actually verify which one's data was kept.  I
know things like the messages file had to be different across systems... so
that is an interesting question.

As for the hanging (and forgive me if he said this, as I've not read the
OP's post), couldn't you simply do a detach before removing the disk, and a
re-attach every time you wanted to re-mirror?  Then there'd be no hanging
involved with a missing device on boot, as it would assume it's got every
disk in the pool.  When you re-attach it, at least when I've tested this, it
appears to acknowledge the data on disk and simply resilver the changes since
it was last attached.  Maybe what I saw was a fluke though.
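
In other words, something like this around each unplug/replug cycle (device
names are examples):

  # Before unplugging the external half of the mirror:
  zpool detach rpool c5t0d0s0
  # After plugging it back in, re-mirror and let it resilver:
  zpool attach rpool c3d0s0 c5t0d0s0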

---Tim
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss


Re: [zfs-discuss] Using zfs mirror as a simple backup mechanism for time-slider.

2008-12-15 Thread Volker A. Brandt
 A  while back, I posted here about the issues ZFS has with USB hotplugging
 of ZFS formatted media when we were trying to plan an external media backup
 solution for time-slider:
 http://www.opensolaris.org/jive/thread.jspa?messageID=299501
[...]

 There are a few minor issues however which I'd love to get some feedback on 
 in addition
 to the overall direction of this proposal:

 1. When the external device is disconnected, the zpool status output reports 
 that the
 pool is in a degraded state and displays a status message that indicates 
 that there
 was an unrecoverable error. While this is all technically correct, and is 
 appropriate
 in the context of a setup where it is assumed that the mirrored device is 
 always
 connected, it might lead a user to be unnecessarily alarmed when his 
 backup mirror
 disk is not connected. We're trying to use a mirror configuration here in 
 a manner that
 is a bit different from the conventional one, but not in any way that
 it's not designed
 to cope with.
[...]

 So I'd like to ask if this is an appropriate use of ZFS mirror functionality? 
 It has many benefits
 that we really should take advantage of.

Yes, by all means.  I am doing something very similar on my T1000, but
I have two separate one-disk pools and copy to the backup pool using
rsync.  I would very much like to replace this with automatic resilvering.
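
A minimal sketch of that kind of one-way copy, assuming the live and backup
pools are mounted at /tank and /backup (names are examples):

  # Mirror the live data onto the backup pool, deleting stale files.
  rsync -a --delete /tank/home/ /backup/home/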

One prerequisite for wide adoption would be to fix the issue #1 you
described above.  I would advise not to integrate this anywhere before
fixing that degraded display.

BTW is this USB-specific?  While it seems to imply that, you don't state
it anywhere explicitly.  I attach my backup disk via eSATA, power it up,
import the pool, etc.  Not really hotplugging...


Regards -- Volker
-- 

Volker A. Brandt  Consulting and Support for Sun Solaris
Brandt & Brandt Computer GmbH   WWW: http://www.bb-c.de/
Am Wiesenpfad 6, 53340 Meckenheim Email: v...@bb-c.de
Handelsregister: Amtsgericht Bonn, HRB 10513  Schuhgröße: 45
Geschäftsführer: Rainer J. H. Brandt und Volker A. Brandt
___
zfs-discuss mailing list
zfs-discuss@opensolaris.org
http://mail.opensolaris.org/mailman/listinfo/zfs-discuss