Hello Stefan,
On Thu, Apr 22, 2010 at 07:01:44PM +0200, Stefan G. Weichinger wrote:
> Am 21.04.2010 19:55, schrieb Brian Cuttler:
> >
> > Stefan,
> >
> > I don't have what you need, I do have something very specific
> > and fairly ugly.
> >
> > We have a large ZFS pool, we have unique mount points for each
> > user directory and for samba shares. Maintaining a disklist for
> > this system would be administratively difficult.
>
> [...]
>
> I now found the time and brain to read through your scripts ... looks
> quite usable to me.
It has some cruft in it; mostly the script is comments. I wish
I had anywhere near the facility with shell scripts that I
have with DCL, which I almost never use anymore.
> As you mentioned it is not what I am looking for as it scratches a
> different itch ...
>
> Additional questions:
>
> * How do you handle removed zfs-filesystems?
>
> If user simpson_bart is removed and his zfs-fs as well, this would
> result in your scripts removing the relevant DLE as well (if you accept
> the change).
The fetch and filter of the live disklist file removes all of
the "samba" shares for the specified host (lore) but leaves
all other records alone. This is why I can have the user home
directories in the same pool but not be affected by the filtering
I am performing.
Basically I only create single line entries for the samba shares
I am adding and the first thing I do the next day is remove those
very same lines.
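As a rough illustration of that daily cycle, here is a minimal sketch. The host name, dumptype names, and paths are assumptions, and the demo runs against a throwaway copy of a disklist rather than the live file:

```shell
# Hypothetical sketch of the daily filter-and-regenerate step: drop
# only the single-line samba DLEs for one host, keep everything else
# (including the multi-line home-directory DLEs), then append one line
# per share currently present. Names and paths are illustrative.
HOST=finsen
work=$(mktemp -d)

# A miniature stand-in for the live disklist.
cat > "$work/disklist" <<'EOF'
finsen /export/home-H /export/home {
    user-tar2
    include "./[h]*"
}
finsen /export/samba zfs-snapshot2
finsen /export/samba/oldshare zfs-snapshot2
EOF

# Shares present on the client today (normally discovered remotely).
mkdir -p "$work/samba/abbackup" "$work/samba/aiadm"

# 1) Remove yesterday's single-line samba entries for $HOST.
grep -v "^${HOST} /export/samba" "$work/disklist" > "$work/disklist.new"

# 2) Append one entry per share that exists right now.
printf '%s /export/samba zfs-snapshot2\n' "$HOST" >> "$work/disklist.new"
for share in "$work"/samba/*; do
    [ -d "$share" ] || continue
    printf '%s /export/samba/%s zfs-snapshot2\n' \
        "$HOST" "$(basename "$share")" >> "$work/disklist.new"
done

mv "$work/disklist.new" "$work/disklist"
cat "$work/disklist"
```

Note that the `grep -v` anchors on the host and the samba path prefix, which is what leaves the globbed home-directory DLEs untouched.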
This turns out to be very important for me because the number of
home directories is large and going to become very large, and I
needed to glob them, hence the multi-line DLE entries for the same
host (though I'm currently unable to use zfs snapshots with a glob,
I do use snapshots with the samba shares).
Oh, I also use the TCP protocol between the server and client since
UDP didn't allow me to handle the number of DLEs I have. The
changes turn out to be fairly straightforward and affect the
dumptype, inetd.conf... but they were well documented, and once
I understood what I was supposed to do I had little or no trouble
getting it to work (thank you Dustin).
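For what it's worth, the UDP-to-TCP change amounts to fragments like the following; the exact paths, user name, and service list are assumptions and vary with the Amanda version:

```
# In the dumptype (or the global dumptype the others inherit from):
define dumptype global {
    # ...
    auth "bsdtcp"
}

# And on the client, the amandad entry in inetd.conf switches to tcp,
# something along the lines of (paths/user illustrative):
amanda stream tcp nowait amanda /usr/libexec/amanda/amandad amandad -auth=bsdtcp amdump
```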
If I were writing multi-line DLEs I'd probably keep my more static
DLEs in a template file and append the two files (static and the
one created daily by the script) each day.
Does the disklist have an 'include additional file' statement? That
would be helpful (and cleaner) in this situation.
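Absent an include statement, the append idea is just a concatenation; a minimal sketch, with assumed file names and demonstrated here on throwaway files:

```shell
# Sketch of the template idea; file names are assumptions. Static
# multi-line DLEs live in one file, the script writes the day's samba
# lines to another, and the live disklist is their concatenation.
tdir=$(mktemp -d)

printf '%s\n' \
    'finsen /export/home-H /export/home {' \
    '    user-tar2' \
    '    include "./[h]*"' \
    '}' > "$tdir/disklist.static"

printf 'finsen /export/samba/abbackup zfs-snapshot2\n' > "$tdir/disklist.daily"

cat "$tdir/disklist.static" "$tdir/disklist.daily" > "$tdir/disklist"
cat "$tdir/disklist"
```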
Per the simpson_bart example: if this was a samba share then yes,
it would be removed from the disklist. If it was a home directory
for the user, then no changes to Amanda occur, since the user
home directories are globbed.
Er, let me include a snippet of the disklist here.
Imagine an otherwise normal disklist file.
First I iterate through the home directories.
Then my script handles the samba shares present on the client.
The script and adjunct code _ignores_ everything in the disklist
except the "finsen" and "samba" stuff and only generates these
specific single-line entries.
:
:
finsen /export/home-H /export/home {
    user-tar2
    include "./[h]*"
}
finsen /export/home-I /export/home {
    user-tar2
    include "./[i]*"
}
:
:
finsen /export/samba zfs-snapshot2
finsen /export/samba/abbackup zfs-snapshot2
finsen /export/samba/aiadm zfs-snapshot2
finsen /export/samba/annemlab zfs-snapshot2
finsen /export/samba/antimicr zfs-snapshot2
finsen /export/samba/bdlshare zfs-snapshot2
:
:
> Doesn't that mean that you can't access the backups via amrecover
> anymore? (ok, I could test that myself here)
We have yet to remove a samba share from this system. While the
script will remove it easily from the disklist file, we take no
steps to remove the records from the index file or any other
Amanda database.
I assume I can use amadmin find and amfetchdump even if there
is no corresponding DLE entry in the disklist file. Worst case
I'd position the tape manually and use amrestore to recover the
dump (actually gtar) file.
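For the record, the commands I have in mind look like this; "Daily" stands in for the real config name, and the share path and tape device are illustrative:

```
# List the dumps Amanda has on record, disklist entry or not:
amadmin Daily find finsen /export/samba/abbackup

# Pull the dump image back from tape or holding disk:
amfetchdump Daily finsen /export/samba/abbackup

# Worst case, with the tape positioned manually:
amrestore /dev/rmt/0n finsen /export/samba/abbackup
```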
> * You seem not to have to take care of the size of one single DLE
> created by your scripts?
I'm asked to backup everything. Since I'm fortunate enough to be
using an SL24/LTO4 library I don't have to worry about the size
of the individual DLE.
I have to say though that it took quite a lot of time and repeated
sit-downs with other managers here to help them understand that
_HUGE_ DLEs just don't back up efficiently. Meaning not that Amanda
will not do it, but that it requires even larger work areas to allow
concurrency, and longer wall-clock time, because even if we can dump
to the holding area it takes a long time to dump large files to
tape, even a fast tape. We are much better off dumping multiple
files of more reasonable size; otherwise you lose the concurrency of
writing to and reading from the work area.
So, "no", beyond initially structuring the disk space we aren't
doing any selection based on file size.
> AFAI remember you use quite big libraries for backups, so I assume your
> zfs-fs can't get too big ... and I also assume you define zfs-quotae ...
We are moving from an 'older', smaller zfs pool of only 1 TByte to
a new pool of 5.5 TBytes. This is why we currently have mount points
of /export and /export2... we had things stepping on each other
before we untangled that.
Yes, we do have quotas on individual user directories and on each
of the samba shares, this was one of the driving forces behind
each home directory and each samba share having a unique mount point.
We capped the size of each 'file system'; I wonder if we couldn't
have quota'd each user on a single file system...
... that would have the advantage of a single DLE to backup while
maintaining user quotas, but the downside of having a single DLE
that was very large.
> This might be some useful feature for your setup, something that reads
> in the used space of each zfs-DLE and triggers a warning if it's bigger
> than some defined value.
I like the idea, but really the user 'data directories' are elsewhere
and by and large the login directories are not that large. With a few
exceptions of course: sys admins who perform builds in their login
directories and have old build kits that go back years, or download
patch clusters to places where they shouldn't be.
It should be pretty simple to pull/parse the disk usage out of the
# zfs list
output and decide how to handle it based on some threshold value.
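Something along these lines, perhaps; a sketch only, fed sample data here, where on the real system you would pipe in the output of `zfs list -Hp -o name,used -r export` (the `-Hp` flags give tab-separated, byte-exact output for scripting). The limit and dataset names are assumptions:

```shell
# Hypothetical threshold check: read "name<TAB>used-bytes" pairs, the
# shape of `zfs list -Hp -o name,used` output, and warn above a limit.
LIMIT=$((50 * 1024 * 1024 * 1024))   # assumed 50 GB cap

check_usage() {
    tab=$(printf '\t')
    while IFS="$tab" read -r name used; do
        if [ "$used" -gt "$LIMIT" ]; then
            printf 'WARNING: %s uses %s bytes (limit %s)\n' \
                "$name" "$used" "$LIMIT"
        fi
    done
}

# Sample data standing in for live `zfs list` output.
printf 'export/home/simpson_bart\t1073741824\nexport/home/builder\t64424509440\n' \
    | check_usage
```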
> > Needless to say the daily mail generated with changes to the
> > disklist is ignored by everyone but myself and I have to run
> > the get and accept scripts myself.
>
> What else should we expect?
Sad, but I'm going to want to go on holiday at some point, and
they are going to find that their new samba share suffered an
end user error but didn't get added to the disklist...
While talking about the volume of data, I will mention that
we recently noted that zfs snapshot dumps were about the same
size for incrementals as for level 0s.
Dustin identified for us that this was because the snapshots
gave the files different device ids and thus they appeared new.
The solution (again, thanks Dustin) was a newer version of gtar
and a new/additional switch in the dumptype that programmed
gtar to ignore the device id number in its file selection criteria.
This saved us a HUGE amount of time/space each night across
multiple systems: not only this client and its server but two
other systems where zfs snapshots are in use.
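The dumptype change was along these lines; a hedged fragment, since the exact syntax depends on the Amanda version (this is the amgtar-application form; underneath it maps to GNU tar's --no-check-device option, which needs a reasonably recent gtar):

```
define application-tool app_amgtar {
    plugin "amgtar"
    # Ignore device numbers when deciding whether a file changed,
    # so a new snapshot's device id doesn't force a full dump.
    property "CHECK-DEVICE" "NO"
}
```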
Not sure if I've been articulate or correctly addressed your
questions. Please let me know and I'll take another pass at this.
> ;-)
>
> Thank you, Stefan
any time,
Brian
---
Brian R Cuttler [email protected]
Computer Systems Support (v) 518 486-1697
Wadsworth Center (f) 518 473-6384
NYS Department of Health Help Desk 518 473-0773
IMPORTANT NOTICE: This e-mail and any attachments may contain
confidential or sensitive information which is, or may be, legally
privileged or otherwise protected by law from further disclosure. It
is intended only for the addressee. If you received this in error or
from someone who was not authorized to send it to you, please do not
distribute, copy or use it or any attachments. Please notify the
sender immediately by reply e-mail and delete this from your
system. Thank you for your cooperation.