Re: Fiddling with exclude statement

2024-02-26 Thread Skylar Thompson
Do you have DIRMC set? In my experience it supersedes any normal exclude
rules, though it would only back up the directories and not their contents.
I think exclude.dir would work to exclude that path and avoid the
directories being backed up.
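For reference, a sketch of how that could look in a server-side client option set (the option set name MY_OPTSET is an assumption for illustration; the path is the one from the original post):

```
define clientopt MY_OPTSET inclexcl "exclude.dir /tech/splunk" force=yes
```

With FORCE=YES the client cannot override the server-side entry; afterwards "dsmc q inclexcl" on the client should list the exclude.dir statement.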

On Mon, Feb 26, 2024 at 03:59:47PM +, Loon, Eric van (ITOP DI) - KLM wrote:
> Hi everybody,
>
> I'm trying to exclude a directory called /tech/splunk (including all 
> subdirectories) through a client option set. The exclude on the client seems 
> to be processed OK:
>
> [root@hostname ~]# dsmc q inclexcl|grep splunk
> Exclude All   /tech/splunk/.../* Server
>
> But when I run an incremental on /tech, I do see subdirectories being backed 
> up:
>
> Incremental backup of volume '/tech'
> Directory-->   1,024 /tech/splunk/etc [Sent]
> Directory-->   1,024 /tech/splunk/var [Sent]
> Directory-->   1,024 /tech/splunk/var/run [Sent]
> Successful incremental backup of '/tech'
>
> I don't understand why, what am I doing wrong here? Thanks for any help in 
> advance!
>
>
> Kind regards,
>
> Eric van Loon
>
> Air France/KLM Core Infra
> 
> For information, services and offers, please visit our web site: 
> https://urldefense.com/v3/__http://www.klm.com__;!!K-Hz7m0Vt54!g42kTl8HJ8JxbVJ1A40Zb9fXV8E9lnOPTSU5QvlEdQAVjSL3ipdypfRwLqD5zxRFgyzCjalDnKMBFkvzrNrVPw$
>  . This e-mail and any attachment may contain confidential and privileged 
> material intended for the addressee only. If you are not the addressee, you 
> are notified that no part of the e-mail or any attachment may be disclosed, 
> copied or distributed, and that any other action related to this e-mail or 
> attachment is strictly prohibited, and may be unlawful. If you have received 
> this e-mail by error, please notify the sender immediately by return e-mail, 
> and delete this message.
>
> Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or its 
> employees shall not be liable for the incorrect or incomplete transmission of 
> this e-mail or any attachments, nor responsible for any delay in receipt.
> Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal Dutch 
> Airlines) is registered in Amstelveen, The Netherlands, with registered 
> number 33014286
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Servergraph alternatives

2023-10-30 Thread Skylar Thompson
Thanks, Eric, I've had other endorsements of TSM Manager, so it will
definitely be on the list. Servergraph's UI can be overwhelming too, so I
won't hold it against TSM Manager. :)

On Thu, Oct 26, 2023 at 06:45:13AM +, Loon, Eric van (ITOP DI) - KLM wrote:
> Hi Skylar,
>
> I've been using TSMManager for years. It gives me so much more insight. I have 
> encountered several unexpected growth issues in the past which I would not 
> have been able to resolve without the historical data collected by 
> TSMManager. The footprint is small and you can manage multiple servers with 
> one single collector. Don't let the GUI (which can be a bit overwhelming) 
> scare you off; when you click around for a few minutes, you will quickly 
> discover the logic behind it.
> I think you can use the product without a license for one month, so just have 
> a look at it.
>
> Kind regards,
> Eric van Loon
> Air France/KLM Core Infra
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: dinsdag 24 oktober 2023 19:09
> To: ADSM-L@VM.MARIST.EDU
> Subject: Servergraph alternatives
>
> We've been a long-time Servergraph customer to do monitoring, reporting and 
> billing for our TSM services, and just received the announcement that Rocket 
> will be ending development at the end of May 2025. What are other Servergraph 
> customers planning on moving to? Servergraph hasn't been perfect (the upgrade 
> from v7 to v9 was particularly challenging) but overall it's been a great 
> product with good support, and we're hoping to find something off-the-shelf 
> rather than rolling our own solution.
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Servergraph alternatives

2023-10-24 Thread Skylar Thompson
We've been a long-time Servergraph customer to do monitoring, reporting and
billing for our TSM services, and just received the announcement that Rocket
will be ending development at the end of May 2025. What are other
Servergraph customers planning on moving to? Servergraph hasn't been
perfect (the upgrade from v7 to v9 was particularly challenging) but
overall it's been a great product with good support, and we're hoping to
find something off-the-shelf rather than rolling our own solution.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: TSM db backup delayed or hung for more than few hours

2023-05-30 Thread Skylar Thompson
Ah, OK, I was thinking of tape devices; I wouldn't expect SAN discovery to
be in play for disk, or at least not in a way that's visible to TSM.

Definitely let the list know what the results of the PMR are, I suspect
many of us will be curious...

On Tue, May 30, 2023 at 11:52:14PM +0800, Saravanan Palanisamy wrote:
> Hi Skylar
>
> We use disk for db backup, and it finishes within 30 minutes once the
> backup starts at the db2 level for a 2 TB database.
>
> There are no errors or warnings reported during the db backup; the only
> concern is that it takes more than 30 minutes for the db backup message to
> appear in the db2diag log after the backup db command is issued on the TSM side.
>
> We just need to ensure the db backup completes twice to clear all the
> logs.
>
> Sometimes we noticed there was no db backup message in the db2diag log and
> the db backup process hung at 0 bytes even after 12 hours.
>
> It is stressful for us to have to monitor the db backup process very
> closely, as it may hang at any time.
>
> Our data fully resides on all flash disk ( db , active , archive and
> directory container )
>
>
>
>
>
> On Tue, 30 May 2023 at 11:29 PM, Skylar Thompson  wrote:
>
> > What's the device type for the DB backup? We'll see this when SAN discovery
> > hasn't caught up with new device paths after a replacement. The telltale
> > will be ANR8975I messages on the library manager and ANR8974I messages on
> > the clients.
> >
> > As others noted, though, it can also be an API package problem. It might be
> > fruitful to look in the activity log around the time backups start, along
> > with the db2diag.log and tsmdbmgr.log files.
> >
> > On Mon, May 29, 2023 at 05:16:50AM +0800, Saravanan Palanisamy wrote:
> > >
> > > V8.1.18
> > >
> > > We have noticed the TSM db backup started at the TSM level but didn't
> > > issue any message to db2, and the db backup at the TSM level appears hung.
> > >
> > > db2 list utilities show details - No process
> > > db2diag.log - didn't see any message like ( Starting an online db
> > > backup )
> > >
> > > We tried restarting the TSM server to clear the hung db2 process, but
> > > no luck; the db backup hung again.
> > >
> > > Main issue here :
> > >
> > > When we start the TSM db backup it is not able to start, and the
> > > archive log became full. We tried increasing the archive log a few times.
> > >
> > > We tried almost 3 times ( waiting more than 1 hour each time ) but no luck.
> > >
> > > We always run a db backup to clear the archive log so that the server
> > > stays up and running.
> > >
> > > The same issue was observed last week on a different server, and today
> > > on one more server.
> > >
> > > We have raised a PMR to check this issue.
> > >
> > > Did anyone face this issue in v8.1.18 ?
> > >
> > > We had never seen this issue before, but noticed it after upgrading to
> > > v8.1.18.
> > >
> > >
> > > Usually when we issue the db backup command it starts immediately and
> > > we see tsmdbmgr sessions on the TSM server. It looks like a strange
> > > issue and is causing outages for production jobs.
> > >
> > >
> > > Regards
> > > Sarav
> > > +65 9857 8665
> >
> > --
> > -- Skylar Thompson (skyl...@u.washington.edu)
> > -- Genome Sciences Department (UW Medicine), System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- Pronouns: He/Him/His
> >
> --
> Thanks & Regards,
> Saravanan
> Mobile: +65-8228 4384

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: TSM db backup delayed or hung for more than few hours

2023-05-30 Thread Skylar Thompson
What's the device type for the DB backup? We'll see this when SAN discovery
hasn't caught up with new device paths after a replacement. The telltale
will be ANR8975I messages on the library manager and ANR8974I messages on
the clients.

As others noted, though, it can also be an API package problem. It might be
fruitful to look in the activity log around the time backups start, along
with the db2diag.log and tsmdbmgr.log files.
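As a sketch, the activity-log check could look like this from an administrative client (the admin ID and password are placeholders, and this is not verified against a live server; ANR8975I is the message number mentioned above):

```
dsmadmc -id=admin -password=secret "query actlog begindate=today search=ANR8975I"
dsmadmc -id=admin -password=secret "query actlog begindate=today search='backup db'"
```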

On Mon, May 29, 2023 at 05:16:50AM +0800, Saravanan Palanisamy wrote:
>
> V8.1.18
>
> We have noticed the TSM db backup started at the TSM level but didn't
> issue any message to db2, and the db backup at the TSM level appears hung.
>
> db2 list utilities show details - No process
> db2diag.log - didn't see any message like ( Starting an online db backup )
>
> We tried restarting the TSM server to clear the hung db2 process, but no
> luck; the db backup hung again.
>
> Main issue here :
>
> When we start the TSM db backup it is not able to start, and the archive
> log became full. We tried increasing the archive log a few times.
>
> We tried almost 3 times ( waiting more than 1 hour each time ) but no luck.
>
> We always run a db backup to clear the archive log so that the server
> stays up and running.
>
> The same issue was observed last week on a different server, and today on
> one more server.
>
> We have raised a PMR to check this issue.
>
> Did anyone face this issue in v8.1.18 ?
>
> We had never seen this issue before, but noticed it after upgrading to v8.1.18.
>
>
> Usually when we issue the db backup command it starts immediately and we
> see tsmdbmgr sessions on the TSM server. It looks like a strange issue and
> is causing outages for production jobs.
>
>
> Regards
> Sarav
> +65 9857 8665

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Good Bye

2023-05-24 Thread Skylar Thompson
Glad to have had you as a TSM colleague across the continent! Good luck
with whatever comes next.

On Wed, May 24, 2023 at 04:01:04PM -0400, Zoltan Forray wrote:
> Folks,
>
> It has been a fun, wild, sometimes entertaining, exasperating, frustrating,
> exhausting (insert your adjective/adverb) learning adventure working with
> DSF / ADSM / TSM / Spectrum Protect / whatever-the-new-name-is, for the
> past 30+ years!
>
> But, as they say, "*The Times They Are A-Changin*". After more than a year
> of analysis, comparisons, discussions, demonstrations, meetings, etc, it
> was decided to transition away from IBM/ISP to a new "Enterprise Data
> Protection" solution.
>
> The transition is well underway and we expect to (must) be completely off
> ISP by the end of 2023.
>
> Thank you for all the support I/we have received from this
> mailing-list/forums contributors.  There are some truly stellar, gifted
> individuals here!
>
> SIGNING OFF
>
> --
> *Zoltan Forray*
> Enterprise Data Protection Administrator
> VMware Systems Administrator
> Enterprise Compute & Storage Platforms Team
> VCU Infrastructure Services
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Listing all nodes without filespaces

2023-03-03 Thread Skylar Thompson
Hi Eric,

This should work:

select n.node_name from nodes n left join filespaces f on 
n.node_name=f.node_name where f.node_name is null
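As an aside, the LEFT JOIN form also sidesteps a classic NOT IN pitfall: if the subquery returns any NULL, NOT IN matches nothing. Whether or not that is what bit the query below, the anti-join is the safer pattern. A small illustration with sqlite3 (the table and column names merely mirror the SP NODES/FILESPACES tables; this is a sketch, not the SP database itself):

```shell
# Create toy NODES/FILESPACES tables, then run both query styles.
db=$(mktemp)
sqlite3 "$db" <<'EOF'
CREATE TABLE nodes(node_name TEXT);
CREATE TABLE filespaces(node_name TEXT);
INSERT INTO nodes VALUES ('NODE_A'),('NODE_B'),('LH-PPPC01-DB');
-- One NULL in the subquery column is enough to make NOT IN return nothing:
INSERT INTO filespaces VALUES ('NODE_A'),('NODE_B'),(NULL);
EOF
notin=$(sqlite3 "$db" \
  "SELECT node_name FROM nodes WHERE node_name NOT IN (SELECT node_name FROM filespaces);")
antijoin=$(sqlite3 "$db" \
  "SELECT n.node_name FROM nodes n LEFT JOIN filespaces f ON n.node_name=f.node_name WHERE f.node_name IS NULL;")
echo "NOT IN    -> '$notin'"      # -> '' (empty, because of the NULL)
echo "LEFT JOIN -> '$antijoin'"   # -> 'LH-PPPC01-DB'
rm -f "$db"
```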

On Fri, Mar 03, 2023 at 02:49:20PM +, Loon, Eric van (ITOP DI) - KLM wrote:
> Hi everybody,
>
> On one of my servers I'm trying to generate a list of nodes without 
> filespaces. I know for sure such nodes exist:
>
> Protect: >q files LH-PPPC01-DB
> ANR2034E QUERY FILESPACE: No match found using this criteria.
> ANS8001I Return code 11.
>
> But if I try to generate this list through a SQL query, it returns no nodes:
>
> Protect: >select node_name from nodes where node_name not in (select 
> node_name from filespaces)
> ANR2034E SELECT: No match found using this criteria.
> ANS8001I Return code 11.
>
> I don't understand what I'm doing wrong here; does anybody have an idea? 
> Thanks for your help in advance!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Core Infra
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Removing tape devices

2023-02-02 Thread Skylar Thompson
Can you confirm that the serial_num attribute that udev sees matches the
contents of /proc/scsi/IBMtape?

udevadm info --attribute-walk --name /dev/IBMtape5|grep serial_num

If it does, you might try restarting udevd, though I've had to resort to
reboot when udev just doesn't want to notice that something has changed.

On Thu, Feb 02, 2023 at 04:14:33PM +, Loon, Eric van (ITOP DI) - KLM wrote:
> Hi Skylar,
>
> Yes, all drives are LTO drives (SAN attached). I tried your procedure, but 
> it's not working. The six tape devices are all actually there:
>
> # ls -al /dev/IBMtap*
> crw--- 1 root root 235,0 Feb  2 17:04 /dev/IBMtape0
> crw--- 1 root root 235, 1024 Feb  2 17:04 /dev/IBMtape0n
> crw--- 1 root root 235,1 Feb  2 17:04 /dev/IBMtape1
> crw--- 1 root root 235, 1025 Feb  2 17:04 /dev/IBMtape1n
> crw--- 1 root root 235,2 Feb  2 17:04 /dev/IBMtape2
> crw--- 1 root root 235, 1026 Feb  2 17:04 /dev/IBMtape2n
> crw--- 1 root root 235,3 Feb  2 17:04 /dev/IBMtape3
> crw--- 1 root root 235, 1027 Feb  2 17:04 /dev/IBMtape3n
> crw--- 1 root root 235,4 Feb  2 17:04 /dev/IBMtape4
> crw--- 1 root root 235, 1028 Feb  2 17:04 /dev/IBMtape4n
> crw--- 1 root root 235,5 Feb  2 17:04 /dev/IBMtape5
> crw--- 1 root root 235, 1029 Feb  2 17:04 /dev/IBMtape5n
>
> But one persistent named device is just not created:
>
> # ls -l /dev/lin_tape/by-id/
> total 0
> lrwxrwxrwx 1 root root 17 Feb  2 17:04 changer0 -> ../../IBMchanger4
> lrwxrwxrwx 1 root root 14 Feb  2 17:04 drive1 -> ../../IBMtape5
> lrwxrwxrwx 1 root root 14 Feb  2 17:04 drive2 -> ../../IBMtape2
> lrwxrwxrwx 1 root root 14 Feb  2 17:04 drive3 -> ../../IBMtape3
> lrwxrwxrwx 1 root root 14 Feb  2 17:04 drive4 -> ../../IBMtape0
> lrwxrwxrwx 1 root root 14 Feb  2 17:04 drive5 -> ../../IBMtape4
>
> I'm missing drive6 for some reason. I double-checked the serials in the 
> 98-lin_tape.rules file and compared them to the output of the cat 
> /proc/scsi/IBMtape command and they are all correct...
> Thanks for your help!
>
> Kind regards,
> Eric van Loon
>
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: Thursday, 2 February 2023 16:56
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Removing tape devices
>
> Are these all IBM LTO drives using the lin_tape driver? My experience on
> RHEL7 w/ FC-attached drives is that the devices show up automatically, though 
> I can't remember if it removes them if the drive falls off the fabric. I 
> suspect not, just so that a transient problem doesn't leave the system in an 
> inconsistent state. You might try unloading the kernel module to get rid of 
> the devices:
>
> modprobe -r lin_tape
>
> And then ask udev to reprocess its rules and clean up device links it can't
> find:
>
> udevadm trigger
>
> That assumes the WWNs of the drives in your udev rules don't need updating.
>
> At that point, you ought to be able to re-load lin_tape:
>
> modprobe lin_tape
>
> If you're having trouble getting the drives to show up afterwards, you might 
> try a FC LIP:
>
> echo 1 > /sys/class/fc_host/hostX/issue_lip
>
> Followed by a SCSI rescan:
>
> echo "- - -" > /sys/class/scsi_host/hostN/scan
>
> On Thu, Feb 02, 2023 at 03:38:21PM +, Loon, Eric van (ITOP DI) - KLM 
> wrote:
> > Hi everybody,
> >
> > I'm having some issues with missing persistent names after a HBA 
> > replacement on a Linux server. I'd like to remove all tape devices: all 
> > /dev/IBMtape*, all /dev/lin_tape/by-id/drive* devices, and the SCSI devices. 
> > What is the proper way to do that in Linux?
> > Afterwards, I'd like to run a SCSI rescan to have everything redefined, so I 
> > can start with a clean sheet.
> > Thanks for any help in advance!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM
> > 

Re: Removing tape devices

2023-02-02 Thread Skylar Thompson
Are these all IBM LTO drives using the lin_tape driver? My experience on
RHEL7 w/ FC-attached drives is that the devices show up automatically,
though I can't remember if it removes them if the drive falls off the
fabric. I suspect not, just so that a transient problem doesn't leave the
system in an inconsistent state. You might try unloading the kernel module
to get rid of the devices:

modprobe -r lin_tape

And then ask udev to reprocess its rules and clean up device links it can't
find:

udevadm trigger

That assumes the WWNs of the drives in your udev rules don't need updating.

At that point, you ought to be able to re-load lin_tape:

modprobe lin_tape

If you're having trouble getting the drives to show up afterwards, you
might try a FC LIP:

echo 1 > /sys/class/fc_host/hostX/issue_lip

Followed by a SCSI rescan:

echo "- - -" > /sys/class/scsi_host/hostN/scan

On Thu, Feb 02, 2023 at 03:38:21PM +, Loon, Eric van (ITOP DI) - KLM wrote:
> Hi everybody,
>
> I'm having some issues with missing persistent names after a HBA replacement 
> on a Linux server. I'd like to remove all tape devices: all /dev/IBMtape*, 
> all /dev/lin_tape/by-id/drive* devices, and the SCSI devices. What is the 
> proper way to do that in Linux?
> Afterwards, I'd like to run a SCSI rescan to have everything redefined, so I 
> can start with a clean sheet.
> Thanks for any help in advance!
>
> Kind regards,
> Eric van Loon
> Air France/KLM
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Fastest way to "query backup"

2022-09-09 Thread Skylar Thompson
While you could query the BACKUPS or CONTENTS tables via SQL on the SP
server-side, my experience is that this is no faster than QUERY BACKUP on
the client-side, and a lot more resource-intensive for the SP server. An
alternative would be to set these variables in your mmbackup shell
environment:

export MMBACKUP_DSMC_BACKUP="-auditlogging=full 
-auditlogname=/var/log/dsm-audit-_node-name_-_fs-name_-backup.log"
export MMBACKUP_DSMC_EXPIRE="-auditlogging=full 
-auditlogname=/var/log/dsm-audit-_node-name_-_fs-name_-expire.log"

You'll get messages like this in those files:

09/09/22   05:26:51 ANS1651I Backed Up: /gpfs/foo/bar
09/09/22   05:26:51 ANS1651I Failed: /gpfs/foo/invalid_name
09/07/22   20:50:48 ANS1657I Expired: /gpfs/foo/deleted_file

With a bit of work you can parse together what happened when as mmbackup
runs.
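For instance, a minimal sketch that tallies the outcomes from such an audit log (the sample lines and message numbers are the ones shown above; the scratch file stands in for the real -auditlogname target):

```shell
# Write the three sample audit lines to a scratch file, then tally them.
log=$(mktemp)
cat > "$log" <<'EOF'
09/09/22   05:26:51 ANS1651I Backed Up: /gpfs/foo/bar
09/09/22   05:26:51 ANS1651I Failed: /gpfs/foo/invalid_name
09/07/22   20:50:48 ANS1657I Expired: /gpfs/foo/deleted_file
EOF
summary=$(awk '
  /ANS1651I Backed Up:/ { backed++ }
  /ANS1651I Failed:/    { failed++ }
  /ANS1657I Expired:/   { expired++ }
  END { printf "backed_up=%d failed=%d expired=%d\n", backed, failed, expired }
' "$log")
echo "$summary"   # backed_up=1 failed=1 expired=1
rm -f "$log"
```

The same awk filter can be pointed at the live per-filesystem logs to watch progress while mmbackup runs.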

On Fri, Sep 09, 2022 at 02:01:15PM +0200, Martin Lischewski wrote:
> Hello everyone,
>
> it is my first time posting to this group, so I am sorry if this
> question is too trivial.
>
> We are using mmbackup to back up our Spectrum Scale filesystems. Because
> of some bad experiences in the past we do not trust the output of mmbackup.
>
> Therefore we wrote our own scripts validating the backup state of our
> filesystems. Basically this is what we are doing:
>
> We compare the output of a policy run by gpfs policy engine on the
> filesystem with the output of: `dsmc query backup -subdir=yes -quiet
> `
>
> The policy engine listing all files is relatively fast. The `dsmc q ba`
> takes much longer (3 hours for roughly 36,000,000 entries).
>
> Can you recommend a faster way to list all files which are in backup?
>
> Regards,
>
> Martin
>
>
>
>
>
> 
> 
> Forschungszentrum Juelich GmbH
> 52425 Juelich
> Sitz der Gesellschaft: Juelich
> Eingetragen im Handelsregister des Amtsgerichts Dueren Nr. HR B 3498
> Vorsitzender des Aufsichtsrats: MinDir Volker Rieke
> Geschaeftsfuehrung: Prof. Dr.-Ing. Wolfgang Marquardt (Vorsitzender),
> Karsten Beneke (stellv. Vorsitzender), Prof. Dr. Astrid Lambrecht,
> Prof. Dr. Frauke Melchior
> --------
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: NFS Storage used as directory container storage pool

2022-08-02 Thread Skylar Thompson
I agree; while we've done backups and restores of client data over NFS, I
would be very leery of using it as a backend for storage pool data. I
suspect you'll run into a conflict between what is safe (i.e. caching
policy for dirty buffers) and what will perform well. Using something like
NFS v4.1 might help, but given how new it is I imagine the bug density will
be high, along with the likelihood that the NFS clients and servers might
not agree on the protocol implementation.

On Tue, Aug 02, 2022 at 06:05:37PM +, Francisco Molero wrote:
>  Hi,
> we had very bad experiences with NFS in the past. In my opinion it is not a 
> good idea.
> Regards,
> Francisco J Garcia.
>
>
>  On Tuesday, 2 August 2022, 15:48:48 CEST, Saravanan Palanisamy 
>  wrote:
>
>  Hi Zoltan
>
> Thanks for your inputs. We have no plan to use it as a FILE device class,
> and it may not be a good idea to use a directory container storage pool
> either. Currently we are using S3 as the target for long-term storage and
> are looking for some other alternatives, as we have stored more than 8 PB
> and don't want a single archival storage target.
>
> Regards
> Sarav
>
> On Tue, 2 Aug 2022 at 9:15 PM, Zoltan Forray  wrote:
>
> > We have had lots of issues using NFS (ISILON) storage as standard FILE
> > pools and would never try to use them for containers. Currently we only
> > use them as overflow/next stgpools (vs. active ingest of backups) since we
> > simply don't have sufficient, proper storage to handle backups. IBM's
> > official statement about NFS:
> >
> > https://www.ibm.com/support/pages/ibm-spectrum-protect-server-support-nfs
> >
> > On Tue, Aug 2, 2022 at 9:05 AM Marc Lanteigne 
> > wrote:
> >
> > > Hi Sarav,
> > >
> > > For Directory Container Pools, NFS mounted filesystems are not
> > > recommended. The nature of the file I/O of directory container pools
> > > doesn't work well with NFS mounted filesystems.
> > >
> > > If you plan to use Directory Container pools, I highly recommend that you
> > > follow the Blueprints, otherwise you may not be happy with the
> > performance.
> > > https://www.ibm.com/support/pages/ibm-spectrum-protect-blueprints
> > >
> > > -
> > > Thanks,
> > > Marc...
> > >
> > > Marc Lanteigne
> > > Spectrum Protect Specialist AVP / SRT
> > > IBM Systems, Spectrum Protect / Plus
> > > +1-506-460-9074
> > > marclantei...@ca.ibm.com
> > > Office Hours: Monday to Friday, 7:00 to 15:30 Eastern
> > >
> > > IBM
> > >
> > > -Original Message-
> > > From: ADSM: Dist Stor Manager  On Behalf Of
> > > Saravanan Palanisamy
> > > Sent: Tuesday, August 2, 2022 09:45 AM
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: [EXTERNAL] [ADSM-L] NFS Storage used as directory container
> > > storage pool
> > >
> > > Did anyone try using NFS file system as Directory container storage pool
> > ?
> > >
> > > is there any drawback to use such options?
> > >
> > > we have enough bandwidth 25G to support NFS mount points and its
> > dedicated
> > > for backup servers.
> > >
> > >
> > >
> > > Regards
> > > Sarav
> > > --
> > > Thanks & Regards,
> > > Saravanan
> > > Mobile: +65-8228 4384
> > >
> >
> >
> > --
> > *Zoltan Forray*
> > Enterprise Backup Administrator
> > VMware Systems Administrator
> > Enterprise Compute & Storage Platforms Team
> > VCU Infrastructure Services
> > www.ucc.vcu.edu
> > zfor...@vcu.edu - 804-828-4807
> > Don't be a phishing victim - VCU and other reputable organizations will
> > never use email to request that you reply with your password, social
> > security number or confidential personal information. For more details
> > visit http://phishing.vcu.edu/
> > <https://adminmicro2.questionpro.com>
> >
> --
> Thanks & Regards,
> Saravanan
> Mobile: +65-8228 4384
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: Files not rebinding as expected

2022-08-01 Thread Skylar Thompson
\...\*.* BA7Y_L
> INCLEXCL 2Yes INCLUDE.BACKUP 
> d:\FTPDATA\workathome\...\*.* BA7Y_L
> INCLEXCL 3Yes include.backup 
> d:\...\*.* BA14_L
> INCLEXCL 4Yes include.backup 
> c:\...\*.* BA14_L
>
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


mmbackup preview functionality

2021-12-16 Thread Skylar Thompson
I don't know if there's other folks out there using Spectrum Scale with
mmbackup to Spectrum Protect, but we've been frustrated by the fact that it
doesn't have a preview mode like the SP client has with PREVIEW BACKUP. I
submitted an RFE that people can vote on if it's a useful feature for others:

https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=153520

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Oracle STK SL4000, ACSLS, and SP support

2021-10-11 Thread Skylar Thompson
Hi ADSM-L,

I have a question related to Spectrum Protect support of the STK SL4000 and
ACSLS.

We currently have a pair of STK SL3000s managed with ACSLS, and a single SP
library manager for the two of them for some other SP servers. We're
planning on upgrading the SL3000s to SL4000s, but it's unclear whether IBM
supports ACSLS with the SL4000s. In this device support matrix:

https://www.ibm.com/support/pages/node/716987

the SL3000 is listed twice, once in the LTO section and once in the ACSLS
section, while the SL4000 is *only* listed in the LTO section. Does this
mean the SL4000 is only supported with the native lb (FC/SCSI) driver, or is
it just an oversight that it's not listed in the ACSLS section as well?
I've asked both Oracle and IBM this question, but thought I'd pose it
here as well in case anyone in the field can report that it actually
does work.

My technical instincts say it should be fine since SP only sees ACS IDs for
the libraries and drives, and has no knowledge of the underlying hardware,
but I don't want to get to the end of the hardware upgrade and be left with
some esoteric incompatibility.

Thanks!

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: SQL statement for multiple classes

2021-03-12 Thread Skylar Thompson
Yeah, one of the limitations in SQL is that it's strongly-typed, which
means that the type of each query (including row structure) has to be known
before execution. Other database engines like PostgreSQL provide extensions
written in a procedural language (I think C) that work around this by
returning generic "row" types to emulate a pivot table, but I don't know
whether DB2 has this functionality, or whether it would even be available
via the dsmadmc SQL interface.

Marc's answer with CASE will definitely work, though, if the class
names are known before the query; we have a bunch of management classes
and add new ones fairly regularly, so the spreadsheet pivot-table method
works better for us.

On Fri, Mar 12, 2021 at 04:16:28PM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi Skylar,
>
> Your query returns multiple lines for the same node if a node uses multiple 
> classes:
>
> Node_name class_name  amount
> Nodename1 Class1  11231231
> Nodename1 Class2  31223
>
> I'm looking for a way to have everything in one line, like:
>
> Node_name Class1  Class2
> Nodename1 1123123131223
>
> Thanks for your help!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
>
> -----Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: vrijdag 12 maart 2021 16:49
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: SQL statement for multiple classes
>
> Hi Eric,
>
> You can get part of the way there with GROUP BY:
>
> SELECT node_name,class_name,COUNT(*) FROM archives GROUP BY 
> node_name,class_name
>
> This will give you one row per (node_name,class_name) tuple. In order to get 
> row values as columns, though, you need to do a pivot (aka crosstab), which 
> I'm not sure is possible through the DB2 interface in dsmadmc.
> If you run the query w/ "-dataonly=yes -tab" you can import into a 
> spreadsheet easily and do the pivot there, though.
>
> Hope that helps!
>
> On Fri, Mar 12, 2021 at 03:27:29PM +, Loon, Eric van (ITOP NS) - KLM 
> wrote:
> > Hi everybody,
> >
> > I'm trying to figure out how to create a SQL query to retrieve the amount 
> > of files, per management class, per node in just one query. The ideal 
> > output would be:
> >
> > Nodename  Class1   Class2   Class3
> > Mynode1 1234 475859 3645895
> > Mynode2 12345   274368746  8948382348
> >
> > If you select a count(*) from archives where class_name='CLASS1' you will 
> > only get the amount stored for Class1, so I will have to be able to combine 
> > multiple select count(*) from archives where statements in one single query.
> > Thank you very much for your help!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
> > 
> > 
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department (UW Medicine), System Administrator
> -- Foege Building S046, (206)-685-7354
> -- Pronouns: He/Him/His
> 

Re: SQL statement for multiple classes

2021-03-12 Thread Skylar Thompson
Hi Eric,

You can get part of the way there with GROUP BY:

SELECT node_name,class_name,COUNT(*) FROM archives GROUP BY node_name,class_name

This will give you one row per (node_name,class_name) tuple. In order to
get row values as columns, though, you need to do a pivot (aka crosstab),
which I'm not sure is possible through the DB2 interface in dsmadmc.
If you run the query w/ "-dataonly=yes -tab" you can import into a
spreadsheet easily and do the pivot there, though.

Hope that helps!
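The pivot after a `-dataonly=yes -tab` export is also easy to script instead of using a spreadsheet; a stdlib sketch, with the tab-separated sample lines invented for illustration:

```python
import io
import csv
from collections import defaultdict

# Fake "dsmadmc -dataonly=yes -tab" output of the GROUP BY query above.
tab_output = io.StringIO(
    "NODE1\tCLASS1\t11231231\n"
    "NODE1\tCLASS2\t31223\n"
    "NODE2\tCLASS1\t475859\n"
)

counts = defaultdict(dict)          # node_name -> {class_name: count}
classes = set()
for node, cls, n in csv.reader(tab_output, delimiter="\t"):
    counts[node][cls] = int(n)
    classes.add(cls)

# One row per node, one column per class, 0 where a node has no files
# in that class -- the crosstab the spreadsheet pivot would produce.
header = ["NODE_NAME"] + sorted(classes)
table = [[node] + [counts[node].get(c, 0) for c in sorted(classes)]
         for node in sorted(counts)]
print(header)  # ['NODE_NAME', 'CLASS1', 'CLASS2']
print(table)   # [['NODE1', 11231231, 31223], ['NODE2', 475859, 0]]
```

Unlike the CASE approach, this picks up new management classes automatically.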

On Fri, Mar 12, 2021 at 03:27:29PM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi everybody,
>
> I'm trying to figure out how to create a SQL query to retrieve the amount of 
> files, per management class, per node in just one query. The ideal output 
> would be:
>
> Nodename  Class1   Class2   Class3
> Mynode1 1234 475859 3645895
> Mynode2 12345   274368746  8948382348
>
> If you select a count(*) from archives where class_name='CLASS1' you will 
> only get the amount stored for Class1, so I will have to be able to combine 
> multiple select count(*) from archives where statements in one single query.
> Thank you very much for your help!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
> 
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department (UW Medicine), System Administrator
-- Foege Building S046, (206)-685-7354
-- Pronouns: He/Him/His


Re: SQL query

2020-04-14 Thread Skylar Thompson
You're welcome! Glad that ended up working, despite not having a great way
to test it here. :)
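The query Eric arrived at (quoted below) can be sanity-checked on a toy schema; sqlite3 stands in for DB2 here, and an invented `age_days` column stands in for `days(current_date) - days(backup_date)`:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE nodes (node_name TEXT)")
con.execute("CREATE TABLE backups (node_name TEXT, state TEXT, age_days INT)")
con.executemany("INSERT INTO nodes VALUES (?)",
                [("NODE1",), ("NODE2",), ("NODE3",)])
con.executemany("INSERT INTO backups VALUES (?, ?, ?)",
                [("NODE1", "ACTIVE_VERSION", 45),
                 ("NODE1", "ACTIVE_VERSION", 31),
                 ("NODE2", "ACTIVE_VERSION", 5)])  # too recent to count

# LEFT JOIN keeps every node; CASE turns the missing counts into 0.
rows = con.execute("""
    SELECT a.node_name,
           CASE WHEN b.cnt IS NULL THEN 0 ELSE b.cnt END AS cnt
    FROM nodes a
    LEFT JOIN (SELECT node_name, COUNT(*) AS cnt
               FROM backups
               WHERE age_days >= 30 AND state = 'ACTIVE_VERSION'
               GROUP BY node_name) b
      ON a.node_name = b.node_name
    ORDER BY a.node_name
""").fetchall()
print(rows)  # [('NODE1', 2), ('NODE2', 0), ('NODE3', 0)]
```

NODE2 and NODE3 show up with a count of 0 instead of vanishing, which was the whole point of the thread.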

On Tue, Apr 14, 2020 at 12:07:33PM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi Skylar,
>
> I had to change your suggestion a little bit, but this one is working:
>
> select a.node_name, case when b.count is null then 0 else b.count end as 
> count from nodes a left join (select node_name,count(*) as count from backups 
> where (days(current_date) - days(backup_date) >= 30) and 
> state='ACTIVE_VERSION' group by node_name) b on a.node_name=b.node_name
>
> Thank you VERY much for your help, I really appreciate it!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: donderdag 9 april 2020 20:21
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: SQL query
>
> I forgot that GROUP BY depended on having an entry in the result table.
> Unfortunately I don't have a TSM server with a reasonably-sized backups table 
> to test on (production ones are 1+ billion entries), so I'm kind of in 
> thought experiment territory right now, but what if you did an outer join 
> from the nodes table against a sub-query on the backups table? That would let 
> you replace the count for nodes without an entry in the sub-query with 0 with 
> CASE:
>
> select
>   a.node_name,
>   case when b.count is null
>   then 0
>   else b.count
>   end as count
> from nodes a
> left join (select node_name,count(*) from backups where
>   (days(current_date) - days(a.backup_date) >= 30)
>   and a.state='ACTIVE_VERSION'
>   group by node_name) b on a.node_name=b.node_name
>
>
> On Thu, Apr 09, 2020 at 10:53:09AM +, Loon, Eric van (ITOP NS) - KLM 
> wrote:
> > Hi Skylar,
> >
> > Sorry, but this one doesn't work either, it returns the same results as all 
> > others. I don't think the NULL result is the issue here, it seems to be the 
> > way the results are returned as soon as you select multiple columns. In the 
> > following example, when I select just one, the result is 0:
> >
> > select count(*) from backups where node_name='RAC_098-ORC' and
> > days(current_date) - days(backup_date) >= 3000
> >
> >   Unnamed[1]
> > 
> >0
> >
> > But as soon as you select multiple columns, the result is not 0, but "no 
> > match found":
> >
> > select node_name, count(*) from backups where node_name='RAC_098-ORC'
> > and days(current_date) - days(backup_date) >= 3000 group by node_name 
> > ANR2034E SELECT: No match found using this criteria.
> > ANS8001I Return code 11.
> >
> > Thanks again for your help!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
> >
> >
> >
> > -Original Message-
> > From: ADSM: Dist Stor Manager  On Behalf Of
> > Skylar Thompson
> > Sent: woensdag 8 april 2020 16:03
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: Re: SQL query
> >
> > Ah, I think the problem is that comparing anything with NULL yields NULL 
> > rather than true or false (even NULL compared with NULL; only IS NULL is true). Try this:
> >
> > select b.node_name, count(*)
> > from backups a
> > right join nodes b on a.node_name=b.node_name and b.node_name like '%-ORC'
> > where
> > (a.backup_date is null or ((days(current_date) - 
> > days(a.backup_date) >= 30)))
> > and (a.state is null or a.state='ACTIVE_VERSION') group by
> > b.node_name
> >
> > Note that I also changed the "group by" and projection to use node_name 
> > from the nodes table since that's guaranteed to be set, rather than backups 
> > which would only be set for nodes with entries in the backups table.
> >
> > On Wed, Apr 08, 2020 at 08:26:42AM +, Loon, Eric van (ITOP NS) - KLM 
> > wrote:
> > > Hi Skylar,
> > >
> > > I tried your query, but it also returns just one node with a number > 0, 
> > > all other nodes (which have 0 files) are not listed.
> > > Thanks for your help!
> > >
> > > Kind regards,
> > > Eric van Loon
> > > Air France/KLM Storage & Backup
> > >
> > > -Original Message-
> > > From: ADSM: Dist Stor Manager  On Behalf Of
> > > Skylar Thompson
> > > Sent: dinsdag 7 april 2020 23:42
> > > To: ADSM-L@VM.MARIST.EDU
> > > Subject: Re: SQL query
> > >
> > > I think what you're looking for is an outer join:
> > >

Re: SQL query

2020-04-09 Thread Skylar Thompson
I forgot that GROUP BY depended on having an entry in the result table.
Unfortunately I don't have a TSM server with a reasonably-sized backups
table to test on (production ones are 1+ billion entries), so I'm kind of
in thought experiment territory right now, but what if you did an outer
join from the nodes table against a sub-query on the backups table? That
would let you replace the count for nodes without an entry in the sub-query
with 0 with CASE:

select
a.node_name,
case when b.count is null
then 0
else b.count
end as count
from nodes a
left join (select node_name,count(*) from backups where
(days(current_date) - days(a.backup_date) >= 30)
and a.state='ACTIVE_VERSION'
group by node_name) b on a.node_name=b.node_name


On Thu, Apr 09, 2020 at 10:53:09AM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi Skylar,
>
> Sorry, but this one doesn't work either, it returns the same results as all 
> others. I don't think the NULL result is the issue here, it seems to be the 
> way the results are returned as soon as you select multiple columns. In the 
> following example, when I select just one, the result is 0:
>
> select count(*) from backups where node_name='RAC_098-ORC' and 
> days(current_date) - days(backup_date) >= 3000
>
>   Unnamed[1]
> 
>0
>
> But as soon as you select multiple columns, the result is not 0, but "no 
> match found":
>
> select node_name, count(*) from backups where node_name='RAC_098-ORC' and 
> days(current_date) - days(backup_date) >= 3000 group by node_name
> ANR2034E SELECT: No match found using this criteria.
> ANS8001I Return code 11.
>
> Thanks again for your help!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
>
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: woensdag 8 april 2020 16:03
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: SQL query
>
> Ah, I think the problem is that comparing anything with NULL yields NULL 
> rather than true or false (even NULL compared with NULL; only IS NULL is true). Try this:
>
> select b.node_name, count(*)
> from backups a
> right join nodes b on a.node_name=b.node_name and b.node_name like '%-ORC'
> where
> (a.backup_date is null or ((days(current_date) - days(a.backup_date) 
> >= 30)))
> and (a.state is null or a.state='ACTIVE_VERSION') group by b.node_name
>
> Note that I also changed the "group by" and projection to use node_name from 
> the nodes table since that's guaranteed to be set, rather than backups which 
> would only be set for nodes with entries in the backups table.
>
> On Wed, Apr 08, 2020 at 08:26:42AM +, Loon, Eric van (ITOP NS) - KLM 
> wrote:
> > Hi Skylar,
> >
> > I tried your query, but it also returns just one node with a number > 0, 
> > all other nodes (which have 0 files) are not listed.
> > Thanks for your help!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
> >
> > -Original Message-
> > From: ADSM: Dist Stor Manager  On Behalf Of
> > Skylar Thompson
> > Sent: dinsdag 7 april 2020 23:42
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: Re: SQL query
> >
> > I think what you're looking for is an outer join:
> >
> > select a.node_name, count(*)
> > from backups a
> > right join nodes b on a.node_name=b.node_name and a.node_name like '%-ORC'
> > where
> > ((days(current_date) - days(backup_date) >= 30))
> > and a.state='ACTIVE_VERSION'
> > and b.contact like '%Oracle%'
> > group by a.node_name
> >
> > On Tue, Apr 07, 2020 at 09:08:58AM +, Loon, Eric van (ITOP NS) - KLM 
> > wrote:
> > > Hi guys,
> > >
> > > It must be something very easy, but I can't seem to find the solution 
> > > myself... This is the query I use to list the total amount of Oracle 
> > > backup files older than 30 days:
> > >
> > > select count(*) as OBSOLETE_BACKUPS from backups a,nodes b where 
> > > ((days(current_date) - days(backup_date) >= 30)) and a.node_name like 
> > > '%-ORC' and a.state='ACTIVE_VERSION' and a.node_name=b.node_name and 
> > > b.contact like '%Oracle%'
> > >
> > > I also use this statement to list the files per node:
> > >
> > > select a.node_name, count(*) from backups a,nodes b where
> > > ((days(current_date) - days(backup_date) >= 30)) and a.node_name
> > > like '%-ORC' and a.state='ACTIVE_VERSION' and
> > > a.node_name=b.node_name and b.contact like '%Oracle%' group by
> > > 

Re: SQL query

2020-04-08 Thread Skylar Thompson
Ah, I think the problem is that comparing anything with NULL yields NULL
rather than true or false (even NULL compared with NULL; only IS NULL is true). Try this:

select b.node_name, count(*)
from backups a
right join nodes b on a.node_name=b.node_name and b.node_name like '%-ORC'
where
(a.backup_date is null or ((days(current_date) - days(a.backup_date) >= 
30)))
and (a.state is null or a.state='ACTIVE_VERSION')
group by b.node_name

Note that I also changed the "group by" and projection to use node_name
from the nodes table since that's guaranteed to be set, rather than backups
which would only be set for nodes with entries in the backups table.
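SQL's three-valued logic is easy to demonstrate directly (sqlite3 used here as a stand-in for DB2): any comparison involving NULL yields NULL rather than true, and only IS NULL matches, which is why rows from the outer join vanish from a plain WHERE clause:

```python
import sqlite3

con = sqlite3.connect(":memory:")
# A comparison with NULL is NULL (unknown), so it never satisfies WHERE;
# only an explicit IS NULL test evaluates to true.
(eq,)      = con.execute("SELECT NULL = NULL").fetchone()   # NULL, not true
(cmp30,)   = con.execute("SELECT NULL >= 30").fetchone()    # NULL
(is_null,) = con.execute("SELECT NULL IS NULL").fetchone()  # 1 (true)
print(eq, cmp30, is_null)  # None None 1
```

That is why the query above guards each condition with `a.backup_date is null or ...` before comparing.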

On Wed, Apr 08, 2020 at 08:26:42AM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi Skylar,
>
> I tried your query, but it also returns just one node with a number > 0, all 
> other nodes (which have 0 files) are not listed.
> Thanks for your help!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
>
> -Original Message-
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: dinsdag 7 april 2020 23:42
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: SQL query
>
> I think what you're looking for is an outer join:
>
> select a.node_name, count(*)
> from backups a
> right join nodes b on a.node_name=b.node_name and a.node_name like '%-ORC'
> where
>   ((days(current_date) - days(backup_date) >= 30))
>   and a.state='ACTIVE_VERSION'
>   and b.contact like '%Oracle%'
> group by a.node_name
>
> On Tue, Apr 07, 2020 at 09:08:58AM +, Loon, Eric van (ITOP NS) - KLM 
> wrote:
> > Hi guys,
> >
> > It must be something very easy, but I can't seem to find the solution 
> > myself... This is the query I use to list the total amount of Oracle backup 
> > files older than 30 days:
> >
> > select count(*) as OBSOLETE_BACKUPS from backups a,nodes b where 
> > ((days(current_date) - days(backup_date) >= 30)) and a.node_name like 
> > '%-ORC' and a.state='ACTIVE_VERSION' and a.node_name=b.node_name and 
> > b.contact like '%Oracle%'
> >
> > I also use this statement to list the files per node:
> >
> > select a.node_name, count(*) from backups a,nodes b where
> > ((days(current_date) - days(backup_date) >= 30)) and a.node_name like
> > '%-ORC' and a.state='ACTIVE_VERSION' and a.node_name=b.node_name and
> > b.contact like '%Oracle%' group by a.node_name
> >
> > The statement works fine, but it only shows the nodes with an amount
> > of files > 0. I'm looking for the same command which shows all %-ORC nodes. 
> > So when the amount is 0, it should display the node_name along with the 
> > value 0. I can't figure out how to accomplish it. :( Thanks for any help in 
> > advance!
> >
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage & Backup
> > 
> > 
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
> 

Re: SQL query

2020-04-07 Thread Skylar Thompson
I think what you're looking for is an outer join:

select a.node_name, count(*)
from backups a
right join nodes b on a.node_name=b.node_name and a.node_name like '%-ORC'
where
((days(current_date) - days(backup_date) >= 30))
and a.state='ACTIVE_VERSION'
and b.contact like '%Oracle%'
group by a.node_name

On Tue, Apr 07, 2020 at 09:08:58AM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi guys,
>
> It must be something very easy, but I can't seem to find the solution myself... 
> This is the query I use to list the total amount of Oracle backup files older 
> than 30 days:
>
> select count(*) as OBSOLETE_BACKUPS from backups a,nodes b where 
> ((days(current_date) - days(backup_date) >= 30)) and a.node_name like '%-ORC' 
> and a.state='ACTIVE_VERSION' and a.node_name=b.node_name and b.contact like 
> '%Oracle%'
>
> I also use this statement to list the files per node:
>
> select a.node_name, count(*) from backups a,nodes b where 
> ((days(current_date) - days(backup_date) >= 30)) and a.node_name like '%-ORC' 
> and a.state='ACTIVE_VERSION' and a.node_name=b.node_name and b.contact like 
> '%Oracle%' group by a.node_name
>
> The statement works fine, but it only shows the nodes with an amount of files 
> > 0. I'm looking for the same command which shows all %-ORC nodes. So when 
> the amount is 0, it should display the node_name along with the value 0. I 
> can't figure out how to accomplish it. :(
> Thanks for any help in advance!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
> 
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Combining SQL queries

2020-03-26 Thread Skylar Thompson
Hi Eric,

You can combine these two queries by executing them as subqueries:

select
  *
from (
  select count(*) as "Number_Of_Failed" from events
  where scheduled_start>current_timestamp-24 hours and status='Failed'
) failed,
(
  select count(*) as "Number_Of_Missed" from events
  where scheduled_start>current_timestamp-24 hours and status='Missed'
) missed
;

Hope that helps!
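The same pattern runs unchanged on a toy events table; sqlite3 stands in for DB2, and the 24-hour window is dropped since the sample rows carry no timestamps:

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE events (status TEXT)")
con.executemany("INSERT INTO events VALUES (?)",
                [("Failed",), ("Failed",), ("Missed",), ("Completed",)])

# Two scalar subqueries, cross-joined into a single one-row result:
# one column per status, as wanted for a single SPOC report line.
row = con.execute("""
    SELECT * FROM
      (SELECT COUNT(*) AS Number_Of_Failed FROM events
        WHERE status = 'Failed') failed,
      (SELECT COUNT(*) AS Number_Of_Missed FROM events
        WHERE status = 'Missed') missed
""").fetchone()
print(row)  # (2, 1)
```

Because each subquery returns exactly one row, the cross join always produces exactly one combined row.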

On Thu, Mar 26, 2020 at 03:51:23PM +, Loon, Eric van (ITOP NS) - KLM wrote:
> Hi guys,
>
> I have two SQL queries:
>
> select count(*) as "Number_Of_Failed" from events where 
> scheduled_start>current_timestamp-24 hours and status='Failed'
> select count(*) as "Number_Of_Missed" from events where 
> scheduled_start>current_timestamp-24 hours and status='Missed'
>
> Is it somehow possible to combine them into one single query? I know I can 
> use and status='Failed' or status='Missed', but I would like to be able to 
> see how many are missed and how many are failed separately. My aim is to see 
> if I can use one single line (with multiple columns) to report this in SPOC.
> Thanks for any help in advance!
>
> Kind regards,
> Eric van Loon
> Air France/KLM Storage & Backup
> 
> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: further thoughts on moving tapes from a damaged library

2020-02-03 Thread Skylar Thompson
Yep, we've done this when migrating between STK (LTO) libraries. If the
barcode ranges between the two libraries don't overlap, and you're OK
sharing scratch media, you might not even need to define a second library.

On Mon, Feb 03, 2020 at 06:53:05PM +, Lee, Gary wrote:
> Could I delete the original library a definition from server 1, and 2, 
> redefine library a on server 2, share back to server one?
>
> Then manually check the tapes back into the new library a, preserving name, 
> private, and scratch categories, and redefine devclasses if necessary?

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: What is the process to wipe LTO

2019-12-16 Thread Skylar Thompson
TSM does not, and my understanding is that most regulatory environments
require some kind of physical destruction anyway, as simply overwriting the
tape (even multiple times) is not sufficient to guarantee that the data are
unreadable.

Note also that TSM can manage hardware encryption with LTO drives (the
mechanism varies with the library and the generation of LTO), which might be
sufficient, though you have to take care that your database backups are
handled separately: the database backup is left unencrypted so that the
encryption key can be read in a disaster.

On Fri, Dec 13, 2019 at 05:31:41PM -0500, yoda woya wrote:
> Does TSM offer a utility to wipe LTO tapes?  I would like to
> overwrite whatever data is there.  Thanks in advance for your assistance.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Problems backup NFS mount

2019-11-13 Thread Skylar Thompson
You would need to add an exclude.dir statement to your include/exclude
list:

https://www.ibm.com/support/knowledgecenter/en/SSEQVQ_8.1.7/client/r_opt_exclude.html
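For the path in question, a minimal option-file sketch (on UNIX clients this goes in dsm.sys within the server stanza, or in the file named by the inclexcl option). exclude.dir skips the directory and everything beneath it, so no trailing wildcard is needed:

```
* Skip the NFS snapshot directory and everything below it
exclude.dir /var/echo360/.snapshot
```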

On Fri, Nov 08, 2019 at 10:59:04AM -0500, yoda woya wrote:
> I am using the DOMAIN ALL-NFS option to back up this device but need to
> exclude /var/echo360/.snapshot/.  What would be the syntax of the
> exclude statement to accomplish this?  Thanks.
>
> On Wed, Nov 6, 2019 at 11:17 AM yoda woya  wrote:
>
> > Thanks for the lead.
> >
> > On Wed, Nov 6, 2019 at 10:30 AM Skylar Thompson  wrote:
> >
> >> You'll need to change your NFS mount options, or improve the
> >> network/storage so that you're not experiencing timeouts. Note that this
> >> is
> >> just speculation - there isn't enough information to know whether it's the
> >> case. You don't say what OS you're using, but NFS timeouts would generally
> >> show up somewhere in your system log. Regardless, it's unlikely to be a
> >> TSM
> >> problem so the solution will also lie elsewhere.
> >>
> >> On Wed, Nov 06, 2019 at 10:21:25AM -0500, yoda woya wrote:
> >> > Any idea on how to mitigate that and instruct TSM to proceed.  Thank
> >> you.
> >> >
> >> > 11/06/2019 00:32:26 --- SCHEDULEREC STATUS BEGIN
> >> > 11/06/2019 00:32:26 Total number of objects inspected:  174,098
> >> > 11/06/2019 00:32:26 Total number of objects backed up:  173,575
> >> > 11/06/2019 00:32:26 Total number of objects updated:  0
> >> > 11/06/2019 00:32:26 Total number of objects rebound:  0
> >> > 11/06/2019 00:32:26 Total number of objects deleted:  0
> >> > 11/06/2019 00:32:26 Total number of objects expired:139,994
> >> > 11/06/2019 00:32:26 Total number of objects failed: 505
> >> > 11/06/2019 00:32:26 Total number of objects encrypted:0
> >> > 11/06/2019 00:32:26 Total number of objects grew: 0
> >> > 11/06/2019 00:32:26 Total number of retries: 15
> >> > 11/06/2019 00:32:26 Total number of bytes inspected: 175.56 GB
> >> > 11/06/2019 00:32:26 Total number of bytes transferred:   175.37 GB
> >> > 11/06/2019 00:32:26 Data transfer time:6,123.26 sec
> >> > 11/06/2019 00:32:26 Network data transfer rate:   30,031.10
> >> KB/sec
> >> > 11/06/2019 00:32:26 Aggregate data transfer rate: 18,189.02
> >> KB/sec
> >> > 11/06/2019 00:32:26 Objects compressed by:0%
> >> > 11/06/2019 00:32:26 Total data reduction ratio:0.12%
> >> > 11/06/2019 00:32:26 Elapsed processing time:   02:48:29
> >> > 11/06/2019 00:32:26 --- SCHEDULEREC STATUS END
> >> > 11/06/2019 00:32:26 ANS4010E Error processing '/var/echo360': stale NFS
> >> > handle
> >> > 11/06/2019 00:32:26 --- SCHEDULEREC OBJECT END 2120 11/05/2019 21:20:00
> >> > 11/06/2019 00:32:26 ANS1512E Scheduled event '2120' failed.  Return
> >> code =
> >> > 12.
> >> > 11/06/2019 00:32:26 Sending results for scheduled event '2120'.
> >> > 11/06/2019 00:32:27 Results sent to server for scheduled event '2120'.
> >> >
> >> >
> >> > On Mon, Nov 4, 2019 at 10:17 AM Skylar Thompson  wrote:
> >> >
> >> > > Assuming that you're able to access the mount independently of
> >> backups, my
> >> > > best guess would be that you have a soft mount and are experiencing a
> >> > > timeout.
> >> > >
> >> > > On Sat, Nov 02, 2019 at 04:07:01AM -0400, yoda woya wrote:
> >> > > > Any reason why TSM would think that my NFS mount is stale:
> >> > > >
> >> > > > 11/02/2019 00:32:47 --- SCHEDULEREC STATUS BEGIN
> >> > > > 11/02/2019 00:32:47 Total number of objects inspected:  163,326
> >> > > > 11/02/2019 00:32:47 Total number of objects backed up:  162,913
> >> > > > 11/02/2019 00:32:47 Total number of objects updated:  0
> >> > > > 11/02/2019 00:32:47 Total number of objects rebound:  0
> >> > > > 11/02/2019 00:32:47 Total number of objects deleted:  0
> >> > > > 11/02/2019 00:32:47 Total number of objects expired:541,845
> >> > > > 11/02/2019 00:32:47 Total number of objects failed: 125
> >> > > > 11/02/20

Re: Problems backup NFS mount

2019-11-06 Thread Skylar Thompson
You'll need to change your NFS mount options, or improve the
network/storage so that you're not experiencing timeouts. Note that this is
just speculation - there isn't enough information to know whether it's the
case. You don't say what OS you're using, but NFS timeouts would generally
show up somewhere in your system log. Regardless, it's unlikely to be a TSM
problem so the solution will also lie elsewhere.
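As a minimal sketch, something like this would confirm whether the OS is logging NFS timeouts, and show the kind of hard-mount fstab entry that keeps the client retrying instead of returning I/O errors (the hostnames, paths, and sample log lines below are all made up; on a real system you'd read /var/log/messages or `journalctl -k` rather than a fabricated file):

```shell
#!/bin/sh
# Spot-check the system log for NFS timeouts. Hostnames, paths, and the
# sample log lines are invented so the pipeline can run anywhere.
cat > /tmp/sample-messages <<'EOF'
Nov  6 00:31:02 client kernel: nfs: server nfs1.example.com not responding, still trying
Nov  6 00:31:40 client kernel: nfs: server nfs1.example.com OK
EOF

# Count timeout events per NFS server.
grep 'not responding' /tmp/sample-messages | awk '{print $8}' | sort | uniq -c

# A hard TCP mount (no "soft" option) blocks and retries rather than
# handing errors back to the backup client:
echo 'nfs1.example.com:/export  /var/echo360  nfs  hard,proto=tcp,timeo=600,retrans=2  0 0'
```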

On Wed, Nov 06, 2019 at 10:21:25AM -0500, yoda woya wrote:
> Any idea on how to mitigate that and instruct TSM to proceed.  Thank you.
>
> 11/06/2019 00:32:26 --- SCHEDULEREC STATUS BEGIN
> 11/06/2019 00:32:26 Total number of objects inspected:  174,098
> 11/06/2019 00:32:26 Total number of objects backed up:  173,575
> 11/06/2019 00:32:26 Total number of objects updated:  0
> 11/06/2019 00:32:26 Total number of objects rebound:  0
> 11/06/2019 00:32:26 Total number of objects deleted:  0
> 11/06/2019 00:32:26 Total number of objects expired:139,994
> 11/06/2019 00:32:26 Total number of objects failed: 505
> 11/06/2019 00:32:26 Total number of objects encrypted:0
> 11/06/2019 00:32:26 Total number of objects grew: 0
> 11/06/2019 00:32:26 Total number of retries: 15
> 11/06/2019 00:32:26 Total number of bytes inspected: 175.56 GB
> 11/06/2019 00:32:26 Total number of bytes transferred:   175.37 GB
> 11/06/2019 00:32:26 Data transfer time:6,123.26 sec
> 11/06/2019 00:32:26 Network data transfer rate:   30,031.10 KB/sec
> 11/06/2019 00:32:26 Aggregate data transfer rate: 18,189.02 KB/sec
> 11/06/2019 00:32:26 Objects compressed by:0%
> 11/06/2019 00:32:26 Total data reduction ratio:0.12%
> 11/06/2019 00:32:26 Elapsed processing time:   02:48:29
> 11/06/2019 00:32:26 --- SCHEDULEREC STATUS END
> 11/06/2019 00:32:26 ANS4010E Error processing '/var/echo360': stale NFS
> handle
> 11/06/2019 00:32:26 --- SCHEDULEREC OBJECT END 2120 11/05/2019 21:20:00
> 11/06/2019 00:32:26 ANS1512E Scheduled event '2120' failed.  Return code =
> 12.
> 11/06/2019 00:32:26 Sending results for scheduled event '2120'.
> 11/06/2019 00:32:27 Results sent to server for scheduled event '2120'.
>
>
> On Mon, Nov 4, 2019 at 10:17 AM Skylar Thompson  wrote:
>
> > Assuming that you're able to access the mount independently of backups, my
> > best guess would be that you have a soft mount and are experiencing a
> > timeout.
> >
> > On Sat, Nov 02, 2019 at 04:07:01AM -0400, yoda woya wrote:
> > > Any reason why TSM would think that my NFS mount is stale:
> > >
> > > 11/02/2019 00:32:47 --- SCHEDULEREC STATUS BEGIN
> > > 11/02/2019 00:32:47 Total number of objects inspected:  163,326
> > > 11/02/2019 00:32:47 Total number of objects backed up:  162,913
> > > 11/02/2019 00:32:47 Total number of objects updated:  0
> > > 11/02/2019 00:32:47 Total number of objects rebound:  0
> > > 11/02/2019 00:32:47 Total number of objects deleted:  0
> > > 11/02/2019 00:32:47 Total number of objects expired:541,845
> > > 11/02/2019 00:32:47 Total number of objects failed: 125
> > > 11/02/2019 00:32:47 Total number of objects encrypted:0
> > > 11/02/2019 00:32:47 Total number of objects grew: 0
> > > 11/02/2019 00:32:47 Total number of retries:  6
> > > 11/02/2019 00:32:47 Total number of bytes inspected: 165.53 GB
> > > 11/02/2019 00:32:47 Total number of bytes transferred:   165.23 GB
> > > 11/02/2019 00:32:47 Data transfer time:6,321.49 sec
> > > 11/02/2019 00:32:47 Network data transfer rate:   27,408.19
> > KB/sec
> > > 11/02/2019 00:32:47 Aggregate data transfer rate: 16,174.62
> > KB/sec
> > > 11/02/2019 00:32:47 Objects compressed by:0%
> > > 11/02/2019 00:32:47 Total data reduction ratio:0.18%
> > > 11/02/2019 00:32:47 Elapsed processing time:   02:58:31
> > > 11/02/2019 00:32:47 --- SCHEDULEREC STATUS END
> > > 11/02/2019 00:32:47 ANS4010E Error processing '/var/echo360': stale NFS
> > > handle
> > > 11/02/2019 00:32:47 --- SCHEDULEREC OBJECT END 2120 11/01/2019 21:20:00
> > > 11/02/2019 00:32:47 ANS1512E Scheduled event '2120' failed.  Return code
> > =
> > > 12.
> > > 11/02/2019 00:32:47 Sending results for scheduled event '2120'.
> > > 11/02/2019 00:32:49 Results sent to server for scheduled event '2120'.
> > >
> &g

Re: Problems backup NFS mount

2019-11-04 Thread Skylar Thompson
Assuming that you're able to access the mount independently of backups, my
best guess would be that you have a soft mount and are experiencing a
timeout.

On Sat, Nov 02, 2019 at 04:07:01AM -0400, yoda woya wrote:
> Any reason why TSM would think that my NFS mount is stale:
>
> 11/02/2019 00:32:47 --- SCHEDULEREC STATUS BEGIN
> 11/02/2019 00:32:47 Total number of objects inspected:  163,326
> 11/02/2019 00:32:47 Total number of objects backed up:  162,913
> 11/02/2019 00:32:47 Total number of objects updated:  0
> 11/02/2019 00:32:47 Total number of objects rebound:  0
> 11/02/2019 00:32:47 Total number of objects deleted:  0
> 11/02/2019 00:32:47 Total number of objects expired:541,845
> 11/02/2019 00:32:47 Total number of objects failed: 125
> 11/02/2019 00:32:47 Total number of objects encrypted:0
> 11/02/2019 00:32:47 Total number of objects grew: 0
> 11/02/2019 00:32:47 Total number of retries:  6
> 11/02/2019 00:32:47 Total number of bytes inspected: 165.53 GB
> 11/02/2019 00:32:47 Total number of bytes transferred:   165.23 GB
> 11/02/2019 00:32:47 Data transfer time:6,321.49 sec
> 11/02/2019 00:32:47 Network data transfer rate:   27,408.19 KB/sec
> 11/02/2019 00:32:47 Aggregate data transfer rate: 16,174.62 KB/sec
> 11/02/2019 00:32:47 Objects compressed by:0%
> 11/02/2019 00:32:47 Total data reduction ratio:0.18%
> 11/02/2019 00:32:47 Elapsed processing time:   02:58:31
> 11/02/2019 00:32:47 --- SCHEDULEREC STATUS END
> 11/02/2019 00:32:47 ANS4010E Error processing '/var/echo360': stale NFS
> handle
> 11/02/2019 00:32:47 --- SCHEDULEREC OBJECT END 2120 11/01/2019 21:20:00
> 11/02/2019 00:32:47 ANS1512E Scheduled event '2120' failed.  Return code =
> 12.
> 11/02/2019 00:32:47 Sending results for scheduled event '2120'.
> 11/02/2019 00:32:49 Results sent to server for scheduled event '2120'.
>
> 11/02/2019 00:32:49 ANS1483I Schedule log pruning started.
> 11/02/2019 00:32:49 ANS1484I Schedule log pruning finished successfully.
> 11/02/2019 00:32:49 IBM Spectrum Protect Backup-Archive Client Version 8,
> Release 1, Level 6.0
> 11/02/2019 00:32:49 Querying server for next scheduled event.
> 11/02/2019 00:32:49 Node Name: ECHO
> 11/02/2019 00:32:50 Session established with server TSMN1: Linux/x86_64
> 11/02/2019 00:32:50   Server Version 8, Release 1, Level 8.000
> 11/02/2019 00:32:50   Data compression forced off by the server
> 11/02/2019 00:32:50   Server date/time: 11/02/2019 00:32:50  Last access:
> 11/01/2019 21:35:12

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: very large number of objects being deleted

2019-09-27 Thread Skylar Thompson
I would start with a QUERY OCCUPANCY for the node on the TSM server to
figure out which filespaces have lots of objects, and whether the BA client
was responsible for them, or some other client type. You could also try
looking at the client logs, if you have access to them.
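For example, the occupancy output can be captured once and post-processed; the sketch below assumes it was saved with something like `dsmadmc -dataonly=yes -commadelimited "query occupancy MYNODE"`, and embeds fake sample rows so it runs anywhere (node, pool, and filespace names are invented, and your server's column order may differ from the one labeled here):

```shell
#!/bin/sh
# Rank a node's filespaces by object count from captured QUERY OCCUPANCY
# output. Sample rows stand in for real comma-delimited dsmadmc output.
# Columns assumed here: node,type,filespace,fsid,pool,num_files,space_mb
cat > /tmp/occ.csv <<'EOF'
MYNODE,Bkup,/,1,POOL1,1523678,498221.5
MYNODE,Bkup,/var,1,POOL1,88231904,1721.3
EOF

# /var sorts first, since it holds the most objects in the sample.
awk -F',' '{printf "%-16s %12d files\n", $3, $6}' /tmp/occ.csv | sort -k2,2nr
```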

On Fri, Sep 27, 2019 at 06:24:24PM +, Lee, Gary wrote:
> Is this normal? TSM 7.1.7.1, running on RedHat 6.9.
>
> Deleting a Windows 2008R2 node's data using delete filespace.
> Killed the process after 5.5 hours at 34 million objects.
> Restarted this morning, has been running for 6 hours and says it has deleted 
> over 88 million objects.
>
> There is only 500 GB of on-site data for the client.
> My vere=14, verdel=14, rete=14, reto=30. Backups occur once per day, which
> should allow point-in-time restores for up to two weeks.
>
> Where did all these objects come from, and will it ever get done?
> I have never seen client with over 120 million objects.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Moving Library Manager function to different ISP server

2019-09-06 Thread Skylar Thompson
Yep, that's been our experience as well. At least for LTO, I believe the
VOLSER is written unencrypted so both the library manager and clients can
read it, while the storage pool data are encrypted using the key in the
client database.

On Fri, Sep 06, 2019 at 10:35:15AM -0400, Zoltan Forray wrote:
> We need to discontinue one of our servers that is an LM and want to make
> sure there aren't any gotchas.
>
> The last time I did this many years ago, the process was fairly
> straightforward (checkout tapes, define library, paths, drives on target
> server, check-in tapes on target server, redo all definitions on other,
> Library Client servers). However, back then we weren't encrypting any tapes.
>
> Since the Encryption method is "Application Managed" (i.e. the ISP server
> that writes to the tape maintains the keys), there shouldn't be any issues
> - right?

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Vote for feature: Media lifecycle

2019-04-04 Thread Skylar Thompson
I voted for it as well. We can get some of this information from ACSLS but
it would be nice to have something that is not tied to Oracle/STK.

On Tue, Mar 26, 2019 at 10:21:12AM +0100, Hans Christian Riksheim wrote:
> Voted.
>
> A few years ago I made an RFE requesting the ability to do audit library
> without using half a day shutting down all the library clients but it was
> rejected. Hope this goes better.
>
> hc
>
> On Tue, Mar 26, 2019 at 9:28 AM Jansen, Jonas 
> wrote:
>
> > Hello,
> >
> > maybe some of you may require automated media lifecycle too. In case you
> > wish
> > this feature, vote for this feature request:
> >
> > https://www.ibm.com/developerworks/rfe/execute?use_case=viewRfe_ID=131259
> >
> > Best regards,
> > ---
> > Jonas Jansen
> >
> > IT Center
> > Gruppe: Server & Storage
> > Abteilung: Systeme & Betrieb
> > RWTH Aachen University
> > Seffenter Weg 23
> > 52074 Aachen
> > Tel: +49 241 80-28784
> > Fax: +49 241 80-22134
> > jan...@itc.rwth-aachen.de
> > www.itc.rwth-aachen.de
> >
> >
> >
> >

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: MSSQL retention for deleted databases

2019-03-12 Thread Skylar Thompson
A few ways to do it:

1. Run a full incremental backup on the node after the database is removed
2. Use the client-side EXPIRE command with a list of paths to mark inactive
3. If the entire node is going away, use the server-side DECOMMISSION NODE
command (take note of the caveats in the documentation though)
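To illustrate option 2, a hedged sketch: the database file paths and filelist location below are invented, and the dsmc invocation is echoed rather than executed so the snippet is safe to run on a box without the BA client installed.

```shell
#!/bin/sh
# Build a filelist of removed database files, then show (dry-run) the
# client-side EXPIRE command that would mark them inactive.
cat > /tmp/expire.list <<'EOF'
/sqldata/olddb.mdf
/sqldata/olddb_log.ldf
EOF

# On a real client, drop the leading "echo":
echo dsmc expire -filelist=/tmp/expire.list -noprompt
```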

On Tue, Mar 12, 2019 at 04:08:58PM +0100, Hans Christian Riksheim wrote:
> Newbie question (only 18 years of experience with TSM):
>
> How do I ensure that deleted databases get inactivated and expired
> according to the copygroup settings? As far as I know inactivation of
> copies >retonly occurs when the database is backed up. If the database is
> removed all active copies will stay active forever and it is now a manual
> task to inactivate them from the client.
>
> Regards,
>
> Hans Chr.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: dsm.sys question

2019-02-27 Thread Skylar Thompson
I don't know that this approach would work - Puppet would see that the file
differs from the deployed file, and would just overwrite it the next time
the agent runs. Puppet would need to manage dsm.sys completely[1] with
Rick's desired changes, or those options would have to be taken out of
dsm.sys and placed server-side.

[1] I think there might be a way to have Puppet only manage a block in a
file, but this would be pretty complicated and prone to error.

On Wed, Feb 27, 2019 at 11:04:43PM +, Harris, Steven wrote:
> Rick
>
> The m4 macro processor is a standard unix offering and can do anything from 
> simple includes and variable substitutions to lisp-like processing that will 
> boggle your mind. An m4 macro with some include files and a makefile with a 
> cron job to build your dsm.sys might do the job.
>
> Cheers
>
> Steve
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Skylar Thompson
> Sent: Thursday, 28 February 2019 6:10 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] dsm.sys question
>
> Hi Rick,
>
> I'm not aware of a mechanism that allows one to do that with dsmc/dsm.sys, 
> but Puppet does have the ability to include arbitrary lines in a file, either 
> via a template or directly in a rule definition.
>
> Another option would be to use server-side client option sets:
>
> https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.1/com.ibm.itsm.srv.doc/t_mgclinod_mkclioptsets.html
>
> These options mirror what can be set in dsm.sys, and can either be overridden 
> by the client, or enforced by the server.
>
> On Wed, Feb 27, 2019 at 06:58:30PM +, Rhodes, Richard L. wrote:
> > Hello,
> >
> > Our Unix team is implementing a management application named Puppet.
> > They are running into a problem using Puppet to setup/maintain the TSM
> > client dsm.sys files.  They create/maintain the dsm.sys as per a
> > template of some kind.  If you change a dsm.sys with a unique option,
> > it gets overwritten by the standard template when Puppet
> > refreshes/checks the file.  The inclexcl option pulls include/excludes
> > from a separate local file so this works fine for local specific needs.
> > But some systems need other settings or whole servername stanzas that
> > are unique.  I've looked through the BA client manual and see no way
> > to include arbitrary lines from some other file into dsm.sys.
> >
> > Q) Is there a way to source options from another file into the dsm.sys, 
> > kind of like the inclexcl option does?
> >
> >
> > Thanks
> >
> > Rick
> > --
> > 
> >
> > The information contained in this message is intended only for the personal 
> > and confidential use of the recipient(s) named above. If the reader of this 
> > message is not the intended recipient or an agent responsible for 
> > delivering it to the intended recipient, you are hereby notified that you 
> > have received this document in error and that any review, dissemination, 
> > distribution, or copying of this message is strictly prohibited. If you 
> > have received this communication in error, please notify us immediately, 
> > and delete the original message.
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: dsm.sys question

2019-02-27 Thread Skylar Thompson
Hi Rick,

I'm not aware of a mechanism that allows one to do that with dsmc/dsm.sys,
but Puppet does have the ability to include arbitrary lines in a file,
either via a template or directly in a rule definition.

Another option would be to use server-side client option sets:

https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.1/com.ibm.itsm.srv.doc/t_mgclinod_mkclioptsets.html

These options mirror what can be set in dsm.sys, and can either be
overridden by the client, or enforced by the server.
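On the server side, that could look something like this sketch (the option set and node names are invented; `force=yes` makes the server-enforced value override the client's own dsm.sys):

```
define cloptset LINUXOPTS description="Standard Linux client options"
define clientopt LINUXOPTS inclexcl "exclude.dir /tech/splunk" force=yes
update node MYNODE cloptset=LINUXOPTS
```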

On Wed, Feb 27, 2019 at 06:58:30PM +, Rhodes, Richard L. wrote:
> Hello,
>
> Our Unix team is implementing a management application named Puppet.
> They are running into a problem using Puppet to setup/maintain the
> TSM client dsm.sys files.  They create/maintain the dsm.sys as
> per a template of some kind.  If you change a dsm.sys with a unique
> option, it gets overwritten by the standard template when Puppet
> refreshes/checks the file.  The inclexcl option pulls include/excludes
> from a separate local file so this works fine for local specific needs.
> But some systems need other settings or whole servername stanzas that
> are unique.  I've looked through the BA client manual and see no way to
> include arbitrary lines from some other file into dsm.sys.
>
> Q) Is there a way to source options from another file into the dsm.sys, kind 
> of like the inclexcl option does?
>
>
> Thanks
>
> Rick

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: nfstimeout on server ISILON storage

2018-09-05 Thread Skylar Thompson
Yep, you're right, I misread that (shouldn't send email pre-coffee).

Are the timeouts repeatable enough that you can get a packet capture in
there before and while they're happening?

On Wed, Sep 05, 2018 at 07:09:09PM -0400, Zoltan Forray wrote:
> Skylar,
>
> I sent your comment about UDP vs TCP to my OS tech (beyond my ken) - got
> this feedback:
>
> I assume what they are talking about is this:
>
> hhisilonnfs23.rams.adp.vcu.edu:/ifs/NFS/TSM on /tsmnfs type nfs
> (rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,mountvers=3,mountport=300,
> *mountproto=udp*,local_lock=none,addr=192.168.19.12)
>
> Looks like this is the default setting (also on all the other servers to
> initiate a conversation with the NFS server). However, if you read the
> documentation on this option it goes into detail about how this option
> differs from proto (which is also defined):
>
> https://access.redhat.com/solutions/183583
>
> "mountproto differs from proto as it defines what protocol (TCP or UDP) the
> client will use to initiate the connection and conduct the mount and
> umount operations.
> This differs from the proto option which sets the protocol that the initial
> connection *and* the actual transportation will use."
>
> The proto option (set to TCP in the mount) appears to be determining how
> the actual connection and transport of data is conducted.
>
> When running a tcpdump on Earth I see NFS TCP traffic running over the 23
> VLAN (and the 22 VLAN on other TSM servers) and no UDP packets to speak of.
>
> On Wed, Sep 5, 2018 at 10:25 AM Skylar Thompson  wrote:
>
> > It looks like you're using UDP as a transport - have you tried switching to
> > TCP? Especially with large NFS payload sizes, you're going to get lots of IP
> > fragmentation, since any UDP datagram larger than the link MTU has to be
> > split into fragments at the IP layer.
> >
> > On Wed, Sep 05, 2018 at 09:03:25AM -0400, Zoltan Forray wrote:
> > > A pair of 10G links bonded - CISCO switches.
> > >
> > > On Tue, Sep 4, 2018 at 7:54 PM Skylar Thompson  wrote:
> > >
> > > > Quick question - what's the data link protocol (Ethernet, IB, etc.) and
> > > > link rate
> > > > that you're using?
> > > >
> > > > On Tue, Sep 04, 2018 at 02:05:33PM -0400, Zoltan Forray wrote:
> > > > > We are still fighting issues with ISILON storage. Our current issue
> > is
> > > > with
> > > > > NFS timeouts for the storage a server is using.  We see message like
> > > > these
> > > > > in the server /var/log
> > > > >
> > > > > Sep  4 13:21:49 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:21:49 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:21:49 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:21:49 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:22:14 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:22:15 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > > not responding, still trying
> > > > > Sep  4 13:22:16 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > OK
> > > > > Sep  4 13:22:16 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > OK
> > > > > Sep  4 13:22:16 earth kernel: nfs: server
> > hhisilonnfs23.rams.adp.vcu.edu
> > > > OK
> > > > >
> > > > > OS folks say the NFS mount is setup as IBM recommends in various
> > > > documents.
> > > > > So they asked us to implement the nfstimeout option from this
> > document (
> > > > >
> > > >
> > https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.0/com.ibm.itsm.client.doc/r_opt_nfstimeout.html
> > > > ).
> > > > > Yes I realize it is primarily for a client backup of an NFS mount,
> > but
> > > > the
> > > > > statement:
> > > > >
> > > > > Supported Clients This option is for all UNIX and Linux clients. *The
> > > > > server can also define this option*.
> > &g

Re: nfstimeout on server ISILON storage

2018-09-05 Thread Skylar Thompson
It looks like you're using UDP as a transport - have you tried switching to
TCP? Especially with large NFS payload sizes, you're going to get lots of IP
fragmentation, since any UDP datagram larger than the link MTU has to be
split into fragments at the IP layer.

On Wed, Sep 05, 2018 at 09:03:25AM -0400, Zoltan Forray wrote:
> A pair of 10G links bonded - CISCO switches.
>
> On Tue, Sep 4, 2018 at 7:54 PM Skylar Thompson  wrote:
>
> > Quick question - what's the data link protocol (Ethernet, IB, etc.) and
> > link rate
> > that you're using?
> >
> > On Tue, Sep 04, 2018 at 02:05:33PM -0400, Zoltan Forray wrote:
> > > We are still fighting issues with ISILON storage. Our current issue is
> > with
> > > NFS timeouts for the storage a server is using.  We see message like
> > these
> > > in the server /var/log
> > >
> > > Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:22:14 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:22:15 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > > not responding, still trying
> > > Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > OK
> > > Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > OK
> > > Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> > OK
> > >
> > > OS folks say the NFS mount is setup as IBM recommends in various
> > documents.
> > > So they asked us to implement the nfstimeout option from this document (
> > >
> > https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.0/com.ibm.itsm.client.doc/r_opt_nfstimeout.html
> > ).
> > > Yes I realize it is primarily for a client backup of an NFS mount, but
> > the
> > > statement:
> > >
> > > Supported Clients This option is for all UNIX and Linux clients. *The
> > > server can also define this option*.
> > >
> > > throws us - kind-of implying I can use this from the server perspective?
> > > But I can't find any documentation to support using it from the server.
> > >
> > > For you Linux guru's - this is what the mount says:
> > >
> > > hhisilonnfs23.rams.adp.vcu.edu:/ifs/NFS/TSM on /tsmnfs type nfs
> > >
> > (rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,mountvers=3,mountport=300,mountproto=udp,local_lock=none,addr=192.168.19.12)
> > >
> > > Any thoughts?  Suggestion?   Are we simply expecting too much from NFS?
> > >
> > > My OS person also asks why ISP is so slow to write to NFS?  When they
> > did a
> > > test copy of a large file to the NFS mount, they were getting upwards of
> > 8G/s
> > > vs 1.5-3G/s when TSM/ISP writes to it (via EMC monitoring tools).
> > >
> > > --
> > > *Zoltan Forray*
> > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > Xymon Monitor Administrator
> > > VMware Administrator
> > > Virginia Commonwealth University
> > > UCC/Office of Technology Services
> > > www.ucc.vcu.edu
> > > zfor...@vcu.edu - 804-828-4807
> > > Don't be a phishing victim - VCU and other reputable organizations will
> > > never use email to request that you reply with your password, social
> > > security number or confidential personal information. For more details
> > > visit http://phishing.vcu.edu/
> >
> > --
> > -- Skylar Thompson (skyl...@u.washington.edu)
> > -- Genome Sciences Department, System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- University of Washington School of Medicine
> >
>
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://phishing.vcu.edu/

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: nfstimeout on server ISILON storage

2018-09-04 Thread Skylar Thompson
Quick question - what's the data link protocol (Ethernet, IB, etc.) and link 
rate
that you're using?

On Tue, Sep 04, 2018 at 02:05:33PM -0400, Zoltan Forray wrote:
> We are still fighting issues with ISILON storage. Our current issue is with
> NFS timeouts for the storage a server is using.  We see messages like these
> in the server /var/log
>
> Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:21:49 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:22:14 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:22:15 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu
> not responding, still trying
> Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
> Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
> Sep  4 13:22:16 earth kernel: nfs: server hhisilonnfs23.rams.adp.vcu.edu OK
>
> OS folks say the NFS mount is setup as IBM recommends in various documents.
> So they asked us to implement the nfstimeout option from this document (
> https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.0/com.ibm.itsm.client.doc/r_opt_nfstimeout.html).
> Yes I realize it is primarily for a client backup of an NFS mount, but the
> statement:
>
> Supported Clients This option is for all UNIX and Linux clients. *The
> server can also define this option*.
>
> throws us - kind-of implying I can use this from the server perspective?
> But I can't find any documentation to support using it from the server.
>
> For you Linux gurus - this is what the mount says:
>
> hhisilonnfs23.rams.adp.vcu.edu:/ifs/NFS/TSM on /tsmnfs type nfs
> (rw,relatime,sync,vers=3,rsize=131072,wsize=524288,namlen=255,hard,proto=tcp,timeo=600,retrans=2,sec=sys,mountaddr=192.168.19.12,mountvers=3,mountport=300,mountproto=udp,local_lock=none,addr=192.168.19.12)
>
> Any thoughts?  Suggestion?   Are we simply expecting too much from NFS?
>
> My OS person also asks why ISP is so slow to write to NFS?  When they did a
> test copy of a large file to the NFS mount, they were getting upwards of 8G/s
> vs 1.5-3G/s when TSM/ISP writes to it (via EMC monitoring tools).
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://phishing.vcu.edu/

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: MariaDB backups using modern MariaDB methods and high performance restores

2018-09-04 Thread Skylar Thompson
We do this with PostgreSQL - take a snapshot and mount it with a 
preschedulecmd, run an incremental
backup on the snapshot, and then unmount and destroy it with a
postschedulecmd. The complication for Stefan would be that he would have
to restore the entire snapshot in order to have a usable database, which
might take too long for his restore objective. PostgreSQL also supports
continuous backups of the WAL (journal)[1] which allow for more
fine-grained point-in-time restores, but I'm not sure if MySQL/MariaDB have
an equivalent solution.

[1]
https://www.postgresql.org/docs/current/static/continuous-archiving.html
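The client side of that pattern is just a stanza in dsm.sys; a rough sketch (the server name, script paths, and snapshot mount point are placeholders for whatever your snapshot scripts actually do):

```
SErvername TSMPROD
   PRESchedulecmd  "/usr/local/sbin/pg-snap-mount"
   POSTSchedulecmd "/usr/local/sbin/pg-snap-unmount"
   DOMain          /pgsnap
```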

On Tue, Sep 04, 2018 at 05:29:18PM +0200, Remco Post wrote:
> Just a thought. This is a linux server, right? So you have linux LVM. I think 
> it should be possible to make a consistent snapshot using MariaDB and LVM. 
> Then you can backup the snapshot and in case of a disaster restore that. Now, 
> I’ve never attempted this, and I don’t know how to do it, but it seems to be 
> the only viable acceptable solution.
> 
> > On 4 Sep 2018, at 09:49, Stefan Folkerts wrote:
> > 
> > Hi all,
> > 
> > I'm currently looking for the best backup option for a large and extremely
> > transaction-heavy MariaDB database environment. I'm talking about up to
> > 100.000.000 transactions a year (payment industry).
> > 
> > It needs to connect to Spectrum Protect to store its database data; a
> > two-stage backup solution is acceptable, but not for restores, due to
> > the duration of a two-stage restore.
> > 
> > We have looked at one option, but it used the traditional mysqldump
> > methods, which have proven to be unusable for this customer: the restore
> > is up to 8 times slower than the backup, and during the backup all
> > transactions are stored to be committed later, which is an issue with
> > this many transactions.
> > 
> > Zmanda seems to offer newer backup mechanics for MariaDB; I'm wondering if
> > anybody used this with Spectrum Protect that can share some experiences
> > with this solution.
> > Also, any other ideas for solutions that officially support Spectrum
> > Protect would be great.
> > 
> > Thanks in advance,
> >   Stefan
> 
> -- 
> 
>  Met vriendelijke groeten/Kind Regards,
> 
> Remco Post
> r.p...@plcs.nl
> +31 6 248 21 622

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: CentOS vs RHEL

2018-08-23 Thread Skylar Thompson
We've been OK with IBM's best-effort support for CentOS for both client and
server, but we don't have a lot of complexity beyond our size (no TDP, no
VTL, no replication, no de-dupe, etc.).

I would definitely be interested in seeing real support for CentOS, though,
and up-voted the RFE Del passed along (thanks, Del!).

On Thu, Aug 23, 2018 at 12:09:53PM -0400, Zoltan Forray wrote:
> This is mostly targeted at IBM folks, but we are also looking for feedback
> from others who have gone through this.
>
> Since RHEL licensing costs have increased almost threefold over the last 5
> years, we are going to push moving to CentOS. There are no licensing fees
> for CentOS and our current RHEL licensing does not include support. CentOS
> is functionally compatible or binary compatible with RHEL.
>
> So how, if at all, will this affect IBM support in the ISP server arena?
> As far as I can tell, IBM only officially supports AIX, SUSE, RHEL, Debian,
> HP-UX, SOLARIS versions of *NIX for a server.  Then of course there is the
> lin_tape driver compatibility/support.
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Looking for suggestions to deal with large backups not completing in 24-hours

2018-07-19 Thread Skylar Thompson
Sadly, no. I made a feature request for this years ago (back when Isilon
was Isilon) but it didn't go anywhere. At this point, our days of running
Isilon storage are numbered, and we'll be investing in DDN/GPFS for the
foreseeable future, so I haven't really had leverage to push Dell/EMC/Isilon
on the matter.

On Thu, Jul 19, 2018 at 11:31:06PM +, Harris, Steven wrote:
> Is there no journaling/logging service on these Isilons that could be used 
> to maintain a list of changed files and hand-roll a 
> dsmc-selective-with-file-list process similar to what GPFS uses? 
> 
> Cheers
> 
> Steve
> 
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Richard Cowen
> Sent: Friday, 20 July 2018 6:15 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups not 
> completing in 24-hours
> 
> Canary! I like it!
> Richard
> 
> -Original Message-----
> From: ADSM: Dist Stor Manager  On Behalf Of Skylar 
> Thompson
> Sent: Thursday, July 19, 2018 10:37 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] Looking for suggestions to deal with large backups not 
> completing in 24-hours
> 
> There are a couple of ways we've gotten around this problem:
> 
> 1. For NFS backups, we don't let TSM do partial incremental backups, even if 
> we have the filesystem split up. Instead, we mount sub-directories of the 
> filesystem root on our proxy nodes. This has the double advantage of letting 
> us break up the filesystem into multiple TSM filespaces (giving us 
> directory-level backup status reporting, and parallelism in TSM when we have 
> COLLOCG=FILESPACE), and also parallelism at the NFS level when there are 
> multiple NFS targets we can talk to (as in the case with Isilon).
> 
> 2. For GPFS backups, in some cases we can setup independent filesets and let 
> mmbackup process each as a separate filesystem, though we have some instances 
> where the end users want an entire GPFS filesystem to have one inode space so 
> they can do atomic moves as renames. In either case, though, mmbackup does 
> its own "incremental" backups with filelists passed to "dsmc selective", 
> which don't update the last-backup time on the TSM filespace. Our workaround 
> has been to run mmbackup via a preschedule command, and have the actual TSM 
> incremental backup be of an empty directory (I call them canary directories 
> in our documentation) that's set as a virtual mountpoint. dsmc will only run 
> the backup portion of its scheduled task if the preschedule command succeeds, 
> so if mmbackup fails, the canary never gets backed up, and will raise an 
> alert.
> 
> On Wed, Jul 18, 2018 at 03:07:16PM +0200, Lars Henningsen wrote:
> > @All
> > 
> > possibly the biggest issue when backing up massive file systems in parallel 
> > with multiple dsmc processes is expiration. Once you back up a directory 
> > with "subdir no", a no longer existing directory object on that level 
> > is expired properly and becomes inactive. However everything underneath 
> > that remains active and doesn't expire (ever) unless you run a "full" 
> > incremental on the level above (with "subdir yes") - and that kind of 
> > defeats the purpose of parallelisation. Other pitfalls include avoiding 
> > swapping, keeping log files consistent (dsmc doesn't do thread awareness 
> > when logging - it assumes being alone), handling the local dedup cache, 
> > updating backup timestamps for a file space on the server, distributing 
> > load evenly across multiple nodes on a scale-out filer, backing up from 
> > snapshots, chunking file systems up into even parts automatically so you 
> > don't end up with lots of small jobs and one big one, dynamically 
> > distributing load across multiple "proxies" if one isn't enough, 
> > handling exceptions, handling directories with characters you can't parse 
> > to dsmc via the command line, consolidating results in a single, 
> > comprehensible overview similar to the summary of a regular incremental, 
> > being able to do it all in reverse for a massively parallel restore... the 
> > list is quite long.
> > 
> > We developed MAGS (as mentioned by Del) to cope with all that - and more. I 
> > can only recommend trying it out for free.
> > 
> > Regards
> > 
> > Lars Henningsen
> > General Storage
> 
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
> 

Re: Looking for suggestions to deal with large backups not completing in 24-hours

2018-07-19 Thread Skylar Thompson
There are a couple of ways we've gotten around this problem:

1. For NFS backups, we don't let TSM do partial incremental backups, even
if we have the filesystem split up. Instead, we mount sub-directories of the
filesystem root on our proxy nodes. This has the double advantage of
letting us break up the filesystem into multiple TSM filespaces (giving us
directory-level backup status reporting, and parallelism in TSM when we
have COLLOCG=FILESPACE), and also parallelism at the NFS level when there
are multiple NFS targets we can talk to (as in the case with Isilon).

2. For GPFS backups, in some cases we can set up independent filesets and
let mmbackup process each as a separate filesystem, though we have some
instances where the end users want an entire GPFS filesystem to have one
inode space so they can do atomic moves as renames. In either case,
though, mmbackup does its own "incremental" backups with filelists passed
to "dsmc selective", which don't update the last-backup time on the TSM
filespace. Our workaround has been to run mmbackup via a preschedule
command, and have the actual TSM incremental backup be of an empty
directory (I call them canary directories in our documentation) that's set
as a virtual mountpoint. dsmc will only run the backup portion of its
scheduled task if the preschedule command succeeds, so if mmbackup fails,
the canary never gets backed up, and will raise an alert.
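A minimal sketch of that arrangement in dsm.sys terms (all paths and names are hypothetical):

```
* mmbackup does the real work in the preschedule command; the scheduled
* incremental only touches an empty canary directory, so if mmbackup
* fails the canary is never backed up and the schedule shows as failed.
VIRTUALMountpoint /gpfs/fs1/.tsm-canary
DOMAIN            /gpfs/fs1/.tsm-canary
PRESchedulecmd    "/usr/lpp/mmfs/bin/mmbackup /gpfs/fs1 -t incremental"
```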

On Wed, Jul 18, 2018 at 03:07:16PM +0200, Lars Henningsen wrote:
> @All
> 
> possibly the biggest issue when backing up massive file systems in parallel 
> with multiple dsmc processes is expiration. Once you back up a directory with 
> "subdir no", a no longer existing directory object on that level is 
> expired properly and becomes inactive. However everything underneath that 
> remains active and doesn't expire (ever) unless you run a "full" 
> incremental on the level above (with "subdir yes") - and that kind of 
> defeats the purpose of parallelisation. Other pitfalls include avoiding 
> swapping, keeping log files consistent (dsmc doesn't do thread awareness 
> when logging - it assumes being alone), handling the local dedup cache, 
> updating backup timestamps for a file space on the server, distributing load 
> evenly across multiple nodes on a scale-out filer, backing up from snapshots, 
> chunking file systems up into even parts automatically so you don't end up 
> with lots of small jobs and one big one, dynamically distributing load across 
> multiple "proxies" if one isn't enough, handling exceptions, handling 
> directories with characters you can't parse to dsmc via the command line, 
> consolidating results in a single, comprehensible overview similar to the 
> summary of a regular incremental, being able to do it all in reverse for a 
> massively parallel restore... the list is quite long.
> 
> We developed MAGS (as mentioned by Del) to cope with all that - and more. I 
> can only recommend trying it out for free.
> 
> Regards
> 
> Lars Henningsen
> General Storage

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Looking for suggestions to deal with large backups not completing in 24-hours: the GWDG solution briefly explained

2018-07-17 Thread Skylar Thompson
One thing to be aware of with partial incremental backups is the danger of
backing up data multiple times if the mount points are nested. For
instance,

/mnt/backup/some-dir
/mnt/backup/some-dir/another-dir

Under normal operation, a node with DOMAIN set to "/mnt/backup/some-dir
/mnt/backup/some-dir/another-dir" will backup the contents of 
/mnt/backup/some-dir/another-dir
as a separate filespace, *and also* will backup another-dir as a
subdirectory of the /mnt/backup/some-dir filespace. We reported this as a
bug, and IBM pointed us at this flag that can be passed as a scheduler
option to prevent this:

-TESTFLAG=VMPUNDERNFSENABLED
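In client-option terms the situation looks roughly like this (paths taken from the example above; TESTFLAG values are undocumented, so treat this as IBM-support guidance rather than a stable interface):

```
* Without the flag, another-dir is backed up twice: as its own filespace
* and again as a subdirectory of the /mnt/backup/some-dir filespace.
DOMAIN /mnt/backup/some-dir /mnt/backup/some-dir/another-dir
* Passed on the scheduler command line:
*   dsmc schedule -TESTFLAG=VMPUNDERNFSENABLED
```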

On Tue, Jul 17, 2018 at 04:12:17PM +0200, Björn Nachtwey wrote:
> Hi Zoltan,
> 
> OK, I will translate my text as there are some more approaches discussed :-)
> 
> breaking up the filesystems in several nodes will work as long as the nodes
> are of sufficient size.
> 
> I'm not sure if a PROXY node will solve the problem, because each "member
> node" will backup the whole mountpoint. You will need to do partial
> incremental backups. I expect you will do this based on folders, do you?
> So, some questions:
> 1) how will you distribute the folders to the nodes?
> 2) how will you ensure new folders are processed by one of your "member
> nodes"? On our filers many folders are created and deleted, sometimes a
> whole bunch every day. So for me, it was no option to maintain the option
> file manually. The approach from my script / "MAGS" does this somehow
> "automatically".
> 3) what happens if the folders don't grow evenly and all the big ones are
> backed up by one of your nodes? (OK you can change the distribution or even
> add another node)
> 4) Are you going to map each backupnode to different nodes of the isilon
> cluster to distribute the traffic / workload for the isilon nodes?
> 
> best
> Bjørn

-- 
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Looking for suggestions to deal with large backups not completing in 24-hours

2018-07-05 Thread Skylar Thompson
We've implemented file count quotas in addition to our existing byte
quotas to try to avoid this situation. You can improve some things
(metadata on SSDs, maybe get an accelerator node if Isilon still offers
those) but the fact is that metadata is expensive in terms of CPU (both
client and server) and disk.

We chose 1 million objects/TB of allocated disk space. We sort of compete
with a storage system offered by our central IT organization, and picked a
limit higher than what they would provide.

To be honest, though, we're retiring our Isilon systems because the
performance/scalability/cost ratios just aren't as great as they used to
be. Our new storage is GPFS and mmbackup works much better with huge numbers
of files, though it's still not great. In particular, the filelist
generation is based around UNIX sort which is definitely a memory pig,
though it can be split across multiple systems so can scale out pretty
well.
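For what it's worth, mmbackup's scan/sort phase can be spread over several nodes with its node-list option - a sketch with hypothetical filesystem and node names:

```shell
/usr/lpp/mmfs/bin/mmbackup /gpfs/fs1 -t incremental -N helper1,helper2,helper3
```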

On Thu, Jul 05, 2018 at 02:52:27PM -0400, Zoltan Forray wrote:
> As I have mentioned in the past, we have gone through large migrations to
> DFS based storage on EMC ISILON hardware.  As you may recall, we backup
> these DFS mounts (about 90 at last count) using multiple Windows servers
> that run multiple ISP nodes (about 30-each) and they access each DFS
> mount/filesystem via -object=\\rams.adp.vcu.edu\departmentname.
>
> This has led to lots of performance issues with backups, and some
> departments now complain that their backups are running into
> multiple days in some cases.
>
> One such case is a department with 2-nodes with over 30-million objects for
> each node.  In the past, their backups were able to finish quicker since
> they were accessed via dedicated servers and were able to use Journaling to
> reduce the scan times.  Unless things have changed, I believe Journaling is
> not an option due to how the files are accessed.
>
> FWIW, average backups are usually <50k files and <200GB once it finished
> scanning.
>
> Also, the idea of HSM/SPACEMANAGEMENT has reared its ugly head since many
> of these objects haven't been accessed in many years. But as I
> understand it, that won't work either given our current configuration.
>
> Given the current DFS configuration (previously CIFS), what can we do to
> improve backup performance?
>
> So, any-and-all ideas are up for discussion.  There is even discussion on
> replacing ISP/TSM due to these issues/limitations.
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: ISILON storage/FILE DEVCLASS performance issues

2018-05-14 Thread Skylar Thompson
No, this is an NFS server setting, but I'm not sure that it's tunable on
Isilon. On Linux and Solaris, it defaults to some very low value, which is
fine for sequential I/O but really slows down random I/O. On Linux,
RPCNFSDCOUNT can be tuned from the default of 8 to 512, which is fine as
long as the NFS server is running on dedicated hardware.
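On a Linux NFS server, the thread count can be inspected and raised along these lines (values are illustrative; the sysconfig path is RHEL-family and may differ elsewhere):

```shell
cat /proc/fs/nfsd/threads                      # current nfsd thread count
rpc.nfsd 512                                   # raise it for this boot
echo 'RPCNFSDCOUNT=512' >> /etc/sysconfig/nfs  # persist across reboots
```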

On Mon, May 14, 2018 at 10:06:31AM -0400, Zoltan Forray wrote:
> We did some quick research and "NFS thread" controls don't apply in our
> situation and can't be set. Or are you referring to the mountlimit value
> for the devclass?
>
> On Mon, May 14, 2018 at 9:31 AM, Skylar Thompson <skyl...@uw.edu> wrote:
>
> > This sounds pretty good to me. If you can, I would boost your NFS thread
> > count past the number of CPUs that you have, since a lot of NFS is just
> > waiting for the disks to respond. You still need a thread for that, but it
> > won't consume much CPU.
> >
> > On Mon, May 14, 2018 at 08:27:27AM -0400, Zoltan Forray wrote:
> > > Very interesting.  This supports my idea on how I want to layout the
> > > new/replacement server.  The old server is only 16-threads and certainly
> > > could not handle dedup (we can't afford any appliances like DD) since it
> > is
> > buckling under the current backup traffic. The new server has 72-threads
> > as
> > > well as 100TB internal disk.  My idea is to use the fast internal 100TB
> > > disk for inbound traffic and deduping and use the 200TB NFS/ISILON for
> > > nextstoragepool (trying to get completely off 3592-tape storage for
> > onsite
> > > backups). Plus the DB will be on SSD.
> > >
> > > Any thoughts on this configuration?
> > >
> > > On Sun, May 13, 2018 at 8:38 PM, Harris, Steven <
> > > steven.har...@btfinancialgroup.com> wrote:
> > >
> > > > Zoltan
> > > >
> > > > I have a similar issue TSM 7.1.1.300 AIX -> Data Domain.  Have dual
> > 10Gb
> > > > links, but can only get ~4000 writes/sec and 120MB/sec throughput. AIX
> > only
> > > > supports NFS3, and as others have pointed out in this forum recently,
> > the
> > > > stack does not have a good reputation.
> > > >
> > > > I'm finding that the heavy NFS load has other knock on effects, e.g.
> > > > TSMManager keeps reporting the instance offline when it's very busy as
> > it
> > > > gets a network error on some of its regular queries, but these work ok
> > when
> > > > load is light.  Also getting a lot of Severed/reconnected sessions.
> > > > CPU/IO/Paging are not a problem.
> > > >
> > > > Cheers
> > > >
> > > > Steve
> > > >
> > > > Steven Harris
> > > > TSM Admin/Consultant
> > > > Canberra Australia
> > > >
> > > > -Original Message-
> > > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf
> > Of
> > > > Zoltan Forray
> > > > Sent: Saturday, 12 May 2018 1:39 AM
> > > > To: ADSM-L@VM.MARIST.EDU
> > > > Subject: [ADSM-L] ISILON storage/FILE DEVCLASS performance issues
> > > >
> > > > Folks,
> > > >
> > > > ISP 7.1.7.300 on RHEL 6   10G connectivity
> > > >
> > > > We need some guidance on trying to figure out why ISP/TSM write
> > perform to
> > > > ISILON storage (via FILE DEVCLASS) is so horrible.
> > > >
> > > > We recently attached 200TB of ISILON storage to this server so we could
> > > > empty the 36TB of onboard disk drives to move this server to new
> > hardware.
> > > >
> > > > However, per my OS and SAN folks, we are only seeing 1Gbs level of data
> > > > movement from the ISP server.  Doing a regular file copy to this same
> > > > storage peaks at 10Gbs speeds.
> > > >
> > > > So what, if anything, are we doing wrong when it comes to configuring
> > the
> > > > storage for ISP to use?  Are there some secret
> > controls/settings/options to
> > > > tell it to use the storage at max-speeds?
> > > >
> > > > We tried changing the Est/Max capacity thinking larger files would
> > reduce
> > > > the overhead of allocating new pieces constantly.  Changed the Mount
> > Limit
> > > > to a bigger number.  Nothing has helped.
> > > >
> > > > Only thing uses the storage right now is migrations from the original
> > disk
> >

Re: ISILON storage/FILE DEVCLASS performance issues

2018-05-14 Thread Skylar Thompson
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: ISILON storage/FILE DEVCLASS performance issues

2018-05-14 Thread Skylar Thompson
Do you see consistent NFS throughput, or is it bursty? We've never used
Isilon as storage for TSM, but we have had problems generally with too-low
NFS timeouts causing NFS to back off for too long. You can also see this
problem manifest itself with NFS timeout messages in the kernel log.
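For reference, a hardened NFS mount for a FILE-devclass target might look like this (server, export, and option values are illustrative; timeo is in tenths of a second):

```shell
mount -t nfs -o hard,tcp,timeo=600,retrans=2,rsize=1048576,wsize=1048576 \
      isilon:/ifs/tsm /tsm/filepool
```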

On Fri, May 11, 2018 at 11:39:08AM -0400, Zoltan Forray wrote:
> Folks,
>
> ISP 7.1.7.300 on RHEL 6   10G connectivity
>
> We need some guidance on trying to figure out why ISP/TSM write performance to
> ISILON storage (via FILE DEVCLASS) is so horrible.
>
> We recently attached 200TB of ISILON storage to this server so we could
> empty the 36TB of onboard disk drives to move this server to new hardware.
>
> However, per my OS and SAN folks, we are only seeing 1Gbs level of data
> movement from the ISP server.  Doing a regular file copy to this same
> storage peaks at 10Gbs speeds.
>
> So what, if anything, are we doing wrong when it comes to configuring the
> storage for ISP to use?  Are there some secret controls/settings/options to
> tell it to use the storage at max-speeds?
>
> We tried changing the Est/Max capacity thinking larger files would reduce
> the overhead of allocating new pieces constantly.  Changed the Mount Limit
> to a bigger number.  Nothing has helped.
>
> The only thing using the storage right now is migrations from the original
> disk stgpool.
>
>
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Improving Replication performance

2018-04-26 Thread Skylar Thompson
Are you CPU or disk-bound on the source or target servers? Even if you have
lots of CPUs, replication might be running on a single thread and just using
one CPU.

On Thu, Apr 26, 2018 at 02:46:24PM -0400, Zoltan Forray wrote:
> As we get deeper into replication, my boss wants to use it more and more as
> an offsite recovery platform.
>
> As we try to reach "best practices" of replicating everything, we are
> finding this goal difficult, if not impossible, to achieve due to the
> resource demands.
>
> Total we want to eventually replicate is around 700TB from 5-source servers
> to 1-target server which is dedicated to replication.
>
> So the big question is, can this be done?
>
> We recently rebuilt the offsite target server to as big as we could afford
> ($38K).  It has 256GB of RAM.  64-threads of CPU. Storage is primarily
> 500TB of ISILON/NFS. Connectivity is via quad 10G (2-for IP traffic from
> source servers and 2-for ISILON/NFS).
>
> Yet we can only replicate around 3TB daily when we backup around 7TB.
>
> Looking for suggestions/thoughts/experiences?
>
> All boxes are RHEL Linux and 7.1.7.300
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: How to backup ISILON storage

2018-02-08 Thread Skylar Thompson

We have a few dozen Windows systems, but nothing complex enough to require
more than simple POSIX permissions. Most of those Windows systems are
instrument systems feeding an analysis pipeline and all connect with a
single user account. The regular user accounts just belong to standard UNIX
groups so don't really require ACLs to manage.

Most of the systems using the storage are Linux cluster nodes running the
analysis pipeline over NFS.

On Thu, Feb 08, 2018 at 09:44:37AM -0500, Zoltan Forray wrote:
> So you don't have any Windows filesystems on the ISILON? You are a purely
> Linux/Unix shop?
>
> On Thu, Feb 8, 2018 at 9:41 AM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: How to backup ISILON storage

2018-02-08 Thread Skylar Thompson

We briefly looked into doing replication, but trying to convince our user
base (scientists) that they should get several petabytes of disk that they
couldn't directly use would have been a non-starter. At the time we also
"only" had 10Gbps Internet connection, and sync'ing upwards of 50TB/day
would have consumed a substantial part of that uplink. :)

On Thu, Feb 08, 2018 at 11:38:30AM +, Abbott, Joseph wrote:
> I agree with Remco 100%.
> If you can stay away from NDMP.
> We have a large Isilon environment which we back up with TSM/NDMP. It runs
> very long and is absolutely horrific for restores.
> We are making the switch over to Isilon snapshots and replication both native 
> to the Isilon. These solutions outperform TSM/NDMP tenfold.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: How to backup ISILON storage

2018-02-08 Thread Skylar Thompson

We have a pool of 3 1U servers with 10GbE connectivity that mount /ifs (and
various subdirectories) over NFS. Each node has a set of schedules that
has a subset of the mount points added with -domain statements. If a node
fails, we can move those schedules easily over to another node while it's
being repaired. We actually use this same technique for some legacy
Hitachi/BlueARC storage, and it's worked well for that as well.

We don't use ACLs or anything beyond POSIX ownership and permissions, so
don't have to worry about that complexity. I think life would be much more
complicated if that were a requirement, though I would still try to find a
way to avoid NDMP.
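The schedule-failover technique described above can be sketched with standard administrative commands; every schedule, node, and domain name below is hypothetical, so treat this as an outline under those assumptions rather than a recipe:

```shell
# One schedule per subset of NFS mount points, associated with the proxy
# node that currently owns it (all names here are illustrative).
dsmadmc "define schedule STANDARD IFS_SET1 action=incremental \
  options='-domain=\"/ifs/dir1 /ifs/dir2\"' starttime=20:00"
dsmadmc "define association STANDARD IFS_SET1 NFS_NODE1"

# If NFS_NODE1 goes down for repair, hand its schedule to another node:
dsmadmc "delete association STANDARD IFS_SET1 NFS_NODE1"
dsmadmc "define association STANDARD IFS_SET1 NFS_NODE2"
```

The key design point is that the filespaces belong to the schedules' domains, not to any one physical client, so moving an association is all the "failover" that is needed.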

On Wed, Feb 07, 2018 at 10:07:21PM -0500, Zoltan Forray wrote:
> Interesting.  As I said, we have no NDMP experience and weren't aware of the
> vendor specific process.
>
> As for your technique, can you elaborate some more?   Where is the ISILON
> NFS mounted?  To the TSM/ISP server?  How do you preserve file rights?
> When our SAN guy pursued this (NFS) direction, an EMC forum discussion said
> it would not work since "NFS TSM backup would only backup the POSIX
> permissions and not the NTFS permissions" and since the ISILON is primarily
> accessed as DFS, the file attributes/rights are critical!
>
> On Wed, Feb 7, 2018 at 4:30 PM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>

Re: How to backup ISILON storage

2018-02-07 Thread Skylar Thompson
I have stayed away from NDMP because it seems that it locks you into a
particular vendor - if you use Isilon NDMP for backups, then you have to
use Isilon NDMP for the restore. In a major disaster, I would be worried
about the hassle of procuring compatible hardware/software to do the
restore. We instead divide our Isilon storage up into separate NFS
mountpoints/TSM filespaces and then point the client schedules at them with
"-domain='/ifs/dir1 /ifs/dir2'". We backup a 2PB OneFS filesystem in this
manner, with ~200 million active files.

We actually are moving away from Isilon for cost reasons though, and moving
towards GPFS. mmbackup removes a lot of the workload division complexity,
though adds other complexity at the same time. That said, it just invokes
dsmc behind the scenes, which means that we can restore our Isilon backups
to GPFS, and vice versa.
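Because mmbackup drives the ordinary dsmc client underneath, restoring an Isilon-era filespace onto GPFS is just a client restore with a different destination. A minimal sketch, assuming a hypothetical filespace /ifs/dir1 and a GPFS mount at /gpfs/dir1:

```shell
# Restore everything under the old NFS filespace into the new GPFS path;
# -subdir=yes recurses into subdirectories.
dsmc restore "/ifs/dir1/*" /gpfs/dir1/ -subdir=yes
```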

On Wed, Feb 07, 2018 at 03:26:02PM -0500, Zoltan Forray wrote:
> As you recall, we have been trying to figure out an alternative method to
> backing up DFS mounted ISILON storage since the current method of 80+
> separate nodes accessed via the Web interface of the BA client is going
> away.  Plus the backups are taking so long, we have to determine a
> better way.
>
> So, doing some digging, one solution that seems to be touted is using
> NDMP.
>
> We have absolutely zero experience with NDMP  and are looking for some
> guidance / cookbook / real-world experiences on how we would use NDMP to
> backup ISILON storage (>400TB and hundreds of millions of files) and make
> it accessible so someone from a help-desk like environment could handle
> file-level restores!
>
> Or if NDMP is the wrong direction, please tell us so.
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://phishing.vcu.edu/

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Spectre & Meltdown patching in relation to Spectrum Protect

2018-01-23 Thread Skylar Thompson

Hi Stefan,

My understanding is that Meltdown and Spectre took advantage of having the
kernel in the same address space as each user-space process. Previously
this was done to reduce the number of cache and TLB misses around system
calls, but at least for Linux, the mitigation involves removing the kernel
from each process's address space. This means that every system call
potentially involves cache/TLB invalidation, so applications that are
heavy on system calls like TSM are likely to see more of a performance
impact than entirely CPU-bound applications.
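On kernels that ship the mitigations (including the patched RHEL6 kernels), sysfs reports what is actually enabled; output varies by CPU and kernel version, so the listing below is just a quick sanity check:

```shell
# Each file holds the state of one mitigation, e.g. "Mitigation: PTI"
# for Meltdown once kernel page-table isolation is active.
grep . /sys/devices/system/cpu/vulnerabilities/* 2>/dev/null
```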

That said, we've applied the RHEL6 patches and haven't had trouble meeting
our backup windows. We did buy TSM servers with more CPU than we thought we
would need, which might help given that TSM is multi-threaded --- each
thread might run more slowly but at least we have other CPUs that can run
other threads.

We're almost an entirely Linux x86_64 shop, so I don't know the impact on other
platforms.

On Tue, Jan 23, 2018 at 08:29:17AM +0100, Stefan Folkerts wrote:
> Hi,
>
> Has anybody seen any information from IBM in relation to Spectre & Meltdown
> patching for Spectrum Protect servers?
> We have a customer who has found that systems that do lot's of small IO's
> that performance can drop 50% on intel systems, these were seen with
> synthetic benchmarks, not actual load.
>
> So this was certainly not a Spectrum Protect load but I was thinking that
> Spectrum Protect might fall into the category that can experience some sort
> of performance drop when patched for Spectre and Meltdown.
>
> Does anybody know what is the impact of these patches on Spectrum Protect
> systems?
>
> Will the blueprints need to be adjusted for instance or is there some
> percentage of lower performance we should expect after patching and does it
> vary per platform?
>
> Regards,
>Stefan

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Repeated ANR8341I "end-of-volume" messages for some LTO7 volumes

2018-01-09 Thread Skylar Thompson

Indeed, that's a good sanity check. I just checked and both of our LTO7
pools are below their MAXSCRATCH setting, and we have scratch tapes
available in that library. I'm also not seeing any messages in the activity
or mount logs indicating a failure to mount scratch tapes.

Interestingly, the percent utilized for these two volumes was at 40% which,
with a device class format of ULTRIUM7C, would be right at the expected
point where a volume with incompressible data should be full. I wonder if
there's some logic failure within TSM where it assumes >0% compression for
the last file.
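The 40% figure is consistent with the nominal LTO-7 numbers, assuming the ULTRIUM7C format estimates capacity at roughly 2.5:1 compression (6 TB native vs. 15 TB estimated):

```shell
# Percent of the estimated (compressed) capacity consumed when a tape
# full of incompressible data hits physical end-of-volume.
native_gb=6000
estimated_gb=15000
echo "$(( native_gb * 100 / estimated_gb ))%"   # prints 40%
```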

In any event, TSM finally marked these two volumes as full after a few
hundred attempts. I would be sort of curious to know what the last file was
on each of them...

On Tue, Jan 09, 2018 at 02:21:50PM +, Schofield, Neil (Contractor - Storage 
& Middleware,  Backup & Restore) wrote:
> Skylar
>
> It's probably worth ruling out the obvious stuff first.
>
> If the end-of-volume is reached then a new scratch tape will be required. If 
> for some reason a new scratch tape cannot be mounted (eg there are no scratch 
> left) to complete the writing of the spanned file/object then the store 
> operation will fail and the previous end-of-data point on the tape will be 
> used as the starting point the next time the tape is mounted. I've seen many 
> occasions where a  paucity of scratch tapes has resulted in the end-of-tape 
> being reached repeatedly on the handful of remaining 'filling' volumes.
>
> In the cases where the end-of-volume has been reached, has a scratch volume 
> been successfully mounted subsequently to complete the operation?
>
> Regards
> Neil
>
>
> Neil Schofield
> IBM Spectrum Protect SME
> Backup & Recovery | Storage & Middleware | Platform Technologies | 
> Infrastructure Technology Services | Group CIO & IT Change
> LLOYDS BANKING GROUP
> 
>
>
>
> Lloyds Banking Group plc. Registered Office: The Mound, Edinburgh EH1 1YZ. 
> Registered in Scotland no. SC95000. Telephone: 0131 225 4555.
>
> Lloyds Bank plc. Registered Office: 25 Gresham Street, London EC2V 7HN. 
> Registered in England and Wales no. 2065. Telephone 0207626 1500.
>
> Bank of Scotland plc. Registered Office: The Mound, Edinburgh EH1 1YZ. 
> Registered in Scotland no. SC327000. Telephone: 03457 801 801.
>
> Lloyds Bank plc, Bank of Scotland plc are authorised by the Prudential 
> Regulation Authority and regulated by the Financial Conduct Authority and 
> Prudential Regulation Authority.
>
> Halifax is a division of Bank of Scotland plc.
>
> HBOS plc. Registered Office: The Mound, Edinburgh EH1 1YZ. Registered in 
> Scotland no. SC218813.
>
> This e-mail (including any attachments) is private and confidential and may 
> contain privileged material. If you have received this e-mail in error, 
> please notify the sender and delete it (including any attachments) 
> immediately. You must not copy, distribute, disclose or use any of the 
> information in it or any attachments. Telephone calls may be monitored or 
> recorded.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Repeated ANR8341I "end-of-volume" messages for some LTO7 volumes

2018-01-08 Thread Skylar Thompson

Hi ADSM-L,

We've been having a problem where TSM logs an ANR8341I message
("End-of-volume reached for LTO volume nn") for a volume, but it never
actually transitions the volume from filling to full. This means TSM will
dismount the volume, but then try to remount it elsewhere. Currently we're
seeing this problem on two volumes, and it doesn't seem to be tied to any
subset of our drives.

We had this problem previously in November, and after opening a PMR, TSM
support suggested that it was a hardware issue of some kind (drive or
library). I've updated all of our drives to the latest firmware that we can
get from Oracle (we're running SL3000s), which is G9Q2. This seemed to
solve it for a little bit, but now we're seeing it again. I'm not seeing
any obvious errors in the activity log, ACSLS, or the library logs. The
only things I can really see are the ANR8341I messages and the times-mounted
counter going up (we have some volumes that have been mounted well
over a thousand times in less than a year because of this).

We're running TSM v7.1.8.0 on RHEL6 x86_64, with a SL3000 (ACSLS) and IBM
LTO7 drives all running firmware G9Q2.

Before I open another PMR, has anyone else seen this problem? I suspect the
next step will be server tracing, which is likely to be a big performance
hit.

Thanks!

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Should I upgrade to 7.1.8.x ??? (on the client end only)

2018-01-04 Thread Skylar Thompson
> >
> > Test! Test! Test! Search this forum for previous posts about this.
> > There are a bunch of gotchas. Perhaps one of the most severe is what
> > happens to administrator IDs. Create some dummy admin IDs to use in
> > testing, because you can permanently disable your own admin ID if
> > you're not careful. We also know there will be library sharing gotchas.
> >
> > We're actually going to do the backup servers first - after thorough
> > testing. We think we can minimize the risk to things like admin IDs if
> > we upgrade the servers with NO clients yet on 7.1.8. I think that
> > having 7.1.8 clients around will greatly complicate the process of
> > upgrading the servers, especially if any of those 7.1.8 clients are
> > the desktop workstations used by you and your coworkers. It's possible
> > that when you do eventually upgrade your servers to 7.1.8, you'll have
> > to backtrack to each client and manually install new SSL keys, on all
> > client systems, all at once. I hope that cat-herding nightmare can be
> > avoided by upgrading servers first, which will then manage key
> > distribution among clients more gracefully, as they upgrade to 7.1.8
> > one at a time. If I'm wrong about any of this, please chime in.
> >
> > This thing has a big effect. Careful testing is necessary.
> >
> > Roger Deschner
> > University of Illinois at Chicago
> > "I have not lost my mind - it is backed up on tape somewhere."
> > 
> > From: Skylar Thompson <skyl...@u.washington.edu>
> > Sent: Tuesday, January 2, 2018 16:19
> > Subject: Re: Should I upgrade to 7.1.8.x ??? (on the client end only)
> >

Re: Should I upgrade to 7.1.8.x ??? (on the client end only)

2018-01-02 Thread Skylar Thompson

I believe the incompatibility arises if you set SESSIONSECURITY to STRICT
for your nodes. The default is TRANSITIONAL, so you should be fine; IIRC the
only communication problems we had when upgrading our servers to v7.1.8 were
with library sharing.

That said, v7.1.8 was a huge change so I would test it if possible first.
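For anyone checking their own exposure: SESSIONSECURITY is a per-node (and per-administrator) setting on the server, and TRANSITIONAL is the shipped default. A sketch with a hypothetical node name:

```shell
# Inspect the current session security mode, then pin it explicitly.
dsmadmc "query node MYNODE format=detailed"
dsmadmc "update node MYNODE sessionsecurity=transitional"
```

Note that under TRANSITIONAL a node is promoted automatically once it authenticates with a newer client, so this is worth verifying before, not after, a mass client rollout.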

On Tue, Jan 02, 2018 at 05:12:44PM -0500, Tom Alverson wrote:
> Thanks for that link, I am more worried about any "gotchas" caused by
> upgrading the client to 7.1.8 or 8.1.2 before the storage servers get
> upgraded (and start using the new authentication).   What I had not
> realized until I saw the chart is that the new clients are NOT backward
> compatible with old storage servers (which doesn't really affect me since
> we have those all at 7.1.7.2 now).
>
>
> *IBM SPECTRUM PROTECT CLIENT SUPPORT*
>
> includes the Backup-Archive, API, UNIX HSM, and Web clients
> that are compatible with, and currently supported with,
> IBM Spectrum Protect Servers and Storage Agents.
>
> *IBM Spectrum Protect*   *Supported IBM Spectrum Protect*
> *Client Version*         *Server and Storage Agent Versions*
> 8.1.2                    8.1, 7.1
> 8.1.0                    8.1, 7.1, 6.3.x [1]
> 7.1.8                    8.1, 7.1
> 7.1                      8.1, 7.1, 6.3.x [1]
> 6.4 [1]                  8.1, 7.1, 6.3.x [1]
> 6.3 [1][2]               8.1, 7.1, 6.3.x [1]
>
> On Tue, Jan 2, 2018 at 4:42 PM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>
> > There's pretty wide version compatibility between clients and servers; we
> > didn't go v7 server-side until pretty recently but have been running the v7
> > client for a while. IBM has a matrix published here:
> >
> > http://www-01.ibm.com/support/docview.wss?uid=swg21053218
> >
> > For basic backups and restores I think you can deviate even more, but
> > obviously you won't get support.
> >
> > On Tue, Jan 02, 2018 at 03:14:24PM -0500, Tom Alverson wrote:
> > > Our TSM storage servers were all upgraded last year to 7.1.7.2 (before
> > this
> > > new security update came out).   Now I am wondering if I should start
> > using
> > > the updated client or not?   If the servers stay at 7.1.7.2 for now is
> > > there any harm in using the newer client?  I would have to use 7.1.8.0 on
> > > anything older than 2012.  I saw some email traffic earlier that once you
> > > use the new authentication mode on a node you can't go back?  But it
> > seems
> > > that would not be possible until our storage servers get upgraded.
> > >
> > > Is there any downside in my case (where the storage servers are still at
> > > 7.1.7.2) of using the latest client versions in the interim??  Our
> > current
> > > standard client versions now are 7.1.6.4 for 2008 and older, and 8.1.0.0
> > > (yes the horrible buggy one) on newer servers.
> > >
> > > Tom
> >
> > --
> > -- Skylar Thompson (skyl...@u.washington.edu)
> > -- Genome Sciences Department, System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- University of Washington School of Medicine
> >

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Should I upgrade to 7.1.8.x ??? (on the client end only)

2018-01-02 Thread Skylar Thompson
There's pretty wide version compatibility between clients and servers; we
didn't go v7 server-side until pretty recently but have been running the v7
client for a while. IBM has a matrix published here:

http://www-01.ibm.com/support/docview.wss?uid=swg21053218

For basic backups and restores I think you can deviate even more, but
obviously you won't get support.

On Tue, Jan 02, 2018 at 03:14:24PM -0500, Tom Alverson wrote:
> Our TSM storage servers were all upgraded last year to 7.1.7.2 (before this
> new security update came out).   Now I am wondering if I should start using
> the updated client or not?   If the servers stay at 7.1.7.2 for now is
> there any harm in using the newer client?  I would have to use 7.1.8.0 on
> anything older than 2012.  I saw some email traffic earlier that once you
> use the new authentication mode on a node you can't go back?  But it seems
> that would not be possible until our storage servers get upgraded.
>
> Is there any downside in my case (where the storage servers are still at
> 7.1.7.2) of using the latest client versions in the interim??  Our current
> standard client versions now are 7.1.6.4 for 2008 and older, and 8.1.0.0
> (yes the horrible buggy one) on newer servers.
>
> Tom

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Fwd: IBM Spectrum protect 8.1.3 doesn't update "Days since last access" anybody run into this

2017-11-20 Thread Skylar Thompson

Thanks for passing that along, Andy. Turns out we've been hit by this since
our update to 7.1.8 and didn't even notice.

On Mon, Nov 20, 2017 at 04:43:22PM -0500, Andrew Raibeck wrote:
> Hi, I think APAR IT22897 describes the issue you are seeing:
>
> www.ibm.com/support/docview.wss?uid=swg1IT22897
>
> Andy
>
> 
>
> Andrew Raibeck | IBM Spectrum Protect Level 3 | stor...@us.ibm.com
>
> IBM Tivoli Storage Manager links:
> Product support:
> https://www.ibm.com/support/entry/portal/product/tivoli/tivoli_storage_manager
>
> Online documentation:
> http://www.ibm.com/support/knowledgecenter/SSGSG7/landing/welcome_ssgsg7.html
>
> Product Wiki:
> https://www.ibm.com/developerworks/community/wikis/home/wiki/Tivoli%20Storage%20Manager
>
> "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU> wrote on 2017-11-20
> 16:22:25:
>
> > From: Genet Begashaw <gbega...@umd.edu>
> > To: ADSM-L@VM.MARIST.EDU
> > Date: 2017-11-20 16:24
> > Subject: Fwd: IBM Spectrum protect 8.1.3 doesn't update "Days since
> > last access" anybody run into this
> > Sent by: "ADSM: Dist Stor Manager" <ADSM-L@VM.MARIST.EDU>
> >
> > -- Forwarded message --
> > From: Genet Begashaw <gbega...@umd.edu>
> > Date: Mon, Nov 20, 2017 at 8:41 AM
> > Subject: IBM Spectrum protect 8.1.3 doesn't update "Days since last
> access"
> > anybody run into this
> > To: lists...@vm.marist.edu
> >

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Tape Drive Error

2017-11-16 Thread Skylar Thompson
Hi Vincent,

I don't see a tape drive in the message (just the library), so I would
suspect the library itself. You might find these documents useful:

https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.1/com.ibm.itsm.msgs.server.doc/ioerrorcodes.html
https://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.1/com.ibm.itsm.msgs.server.doc/ioerrorcodes_ascdesc.html#ioerrorcodes_ascdesc

Key=04 == "hardware error"
ASC=15 ASCQ=01 == "Mechanical positioning error"

The library itself might have more information in its logs.
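For what it's worth, the KEY/ASC/ASCQ values can also be pulled straight out
of the SENSE= string in the message. A minimal sketch, assuming fixed-format
SCSI sense data (which is what the 0x70 response code in the first byte
indicates); the helper name is mine, not a TSM API:

```python
# Decode the sense key, ASC, and ASCQ from an ANR8943E SENSE= string.
# Fixed-format sense data carries the sense key in the low nibble of
# byte 2, the ASC in byte 12, and the ASCQ in byte 13.
def decode_sense(sense: str):
    raw = bytes(int(b, 16) for b in sense.strip(".").split("."))
    key = raw[2] & 0x0F          # sense key: 0x04 == hardware error
    asc, ascq = raw[12], raw[13]  # 0x15/0x01 == mechanical positioning error
    return key, asc, ascq

key, asc, ascq = decode_sense(
    "70.00.04.00.00.00.00.0A.00.00.00.00.15.01.00.00.00.00")
print(f"KEY={key:02X} ASC={asc:02X} ASCQ={ascq:02X}")
```

Running it against the sense bytes from Vince's message reproduces the
KEY=04, ASC=15, ASCQ=01 fields that the server reported.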

On Thu, Nov 16, 2017 at 07:25:09PM +, D'antonio, Vincent E. wrote:
> I am running TSM Server Version 6, Release 2, Level 3.0 on AIX 5.3 TL12 SP02 
> - Yes I know all are unsupported now...
>
> I am seeing an error when the library tries to mount a tape:
>
> ANR8943E Hardware or media error on library TAPELIB
>   (OP=6C03, CC=-1, KEY=04, ASC=15, ASCQ=01,
>   
> SENSE=70.00.04.00.00.00.00.0A.00.00.00.00.15.01.00.00.00-
>   .00., Description=An undetermined error has 
> occurred).
>   Refer to the  documentation on I/O error code
>   descriptions.
>
> I have replaced the drive, even removed drives and path and recreated same 
> error.  Anyone have any ideas how to correct or what to look for?
>
> Thanks
> Vince
>
> Vincent D'Antonio
> Aerotek - Leidos

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Question on Collocation

2017-11-09 Thread Skylar Thompson
Hi Jennifer,

While there is a COLLOCGROUP option to MOVE NODEDATA, it sounds like you
just want to move one node's data.  Any kind of data movement operation
will try to respect the collocation setting you provide at the node level,
assuming you have the storage pool set to collocate by group as well.
Something like this would move the node's data to fresh tapes in the same
storage pool:

MOVE NODEDATA some_node FROMSTGPOOL=some_pool

As Marc said, though, as long as collocation at the pool level is set to
either NODE or GROUP, there is no difference between a node being in no
collocation group, and a node being in a collocation group that has no
other nodes.

As for starting a full backup, the TSM term for it is, somewhat confusingly,
a "selective backup". I'm not sure whether this will start a VSS backup, but
it will back up all the eligible files in your backup domain.

On Thu, Nov 09, 2017 at 03:57:43PM +, Drammeh, Jennifer C wrote:
> Marc,
>
>I removed the node from the collocation group and added it to the new 
> collocation group - where it resides by itself. Will the move nodedata 
> command move the data from the tapes in the old collocation group to tapes in 
> the new collocation group? Also, do you know how to initiate a FULL new 
> backup of the entire system.
>
>
> Thanks!
>
>
> Jennifer
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Marc 
> Lanteigne
> Sent: Friday, November 03, 2017 3:07 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] Question on Collocation
>
> Hi Jennifer,
>
> The backup always only applies to the filesystem backup, not systemstate. 
> You'll have to change the mode of the systemstate backup to get a full.
>
> Or use move nodedata to move it all together.
>
> BTW, you would just have needed to remove the node from the collocation 
> group. Nodes not in a collocation group are collocated by node.
>
> Marc...
>
> Sent from my iPhone using IBM Verse
>
> On Nov 3, 2017, 6:11:07 PM, jdram...@u.washington.edu wrote:
>
> From: jdram...@u.washington.edu
> To: ADSM-L@VM.MARIST.EDU
> Cc:
> Date: Nov 3, 2017, 6:11:07 PM
> Subject: [ADSM-L] Question on Collocation
>
>
>  I use collocation and I have a node where the System Admin has requested 
> that his data from a particular node be isolated on tapes by itself. I 
> created a collocation group and have associated this one node with the new 
> group. This node had previously been backing up to a different collocation 
> group. I created a new Policy Domain and associated the node with it as well. 
> This policy domain is set to send the data direct to tape instead of going to 
> our diskpool. I also modified the settings on the server to allow this system 
> to have 2 mount points.
>  I had the SA launch a "full" manual backup from the GUI - using the "Always 
> Backup" option. Here are the problems I am seeing.
>  1.   I can see that the system state data went to a tape that was 
> already used by other nodes (which tells me the collocation is not working)
>  2.   It only mounted 1 tape in a drive while performing the backup
>  3.   The metrics at the end showed that it inspected 283 GB of data and 
> only transferred 175 GB which means it did not actually perform a full backup.
>  Any ideas to help get a FULL 

Re: Question on Collocation

2017-11-09 Thread Skylar Thompson
Hmm... Another possibility I can think of is churn while you're running the
backup. If you have your management class's copy serialization set to
SHRDYNAMIC, SHRSTATIC, or STATIC, the client will try to determine if a
file is changing while the backup is happening (multiple times in the case
of SHRSTATIC) and either tell the server not to commit the data, or retry
the transfer. If there's a retry, I'm not sure how it impacts the bytes
transferred report at the end.

On Thu, Nov 09, 2017 at 03:30:59PM +, Drammeh, Jennifer C wrote:
> Skylar,
>
>Sorry for the delayed response! No client side compression enabled.
>
>
> Jennifer
>
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Skylar Thompson
> Sent: Friday, November 03, 2017 2:49 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] Question on Collocation
>
> Hi Jennifer,
>
> Do you have client-side compression enabled? I'm not familiar with what the 
> GUI reports, but dsmc will show this in the "Objects compressed by"
> line.
>
> On 11/03/2017 02:06 PM, Drammeh, Jennifer C wrote:
> > I use collocation and I have a node where the System Admin has requested 
> > that his data from a particular node be isolated on tapes by itself. I 
> > created a collocation group and have associated this one node with the new 
> > group. This node had previously been backing up to a different collocation 
> > group. I created a new Policy Domain and associated the node with it as 
> > well. This policy domain is set to send the data direct to tape instead of 
> > going to our diskpool. I also modified the settings on the server to allow 
> > this system to have 2 mount points.
> >
> > I had the SA launch a "full" manual backup from the GUI - using the "Always 
> > Backup" option. Here are the problems I am seeing.
> >
> >
> > 1.   I can see that the system state data went to a tape that was 
> > already used by other nodes (which tells me the collocation is not working)
> >
> > 2.   It only mounted 1 tape in a drive while performing the backup
> >
> > 3.   The metrics at the end showed that it inspected 283 GB of data and 
> > only transferred 175 GB which means it did not actually perform a full 
> > backup.
> >
> > Any ideas to help get a FULL backup of this data alone on a single 1
> > or 2 tapes and ideally using multiple tape drives to speed things up?
> > (for backup and restore)
> >
> > Thanks!
> >
> >
> > Jennifer
>
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: log not flushing

2017-11-09 Thread Skylar Thompson
Hi Remco,

This sounds like database reorganization to me. You can verify by looking
for ANR0293I ("reorganization for table ... started") and ANR0294I
("reorganization for table ... ended") messages in the activity log. There
are some server option tunings you can do to control it (ALLOWREORGTABLE,
ALLOWREORGINDEX, REORGBEGINTIME, REORGDURATION) but in general it's a good
thing for long-term performance and disk utilization. When the server is
otherwise idle is actually the best time to do the reorgs, since it won't 
interfere
with your production loads.
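A rough sketch of pairing those messages up from `q actlog` output, to see
which table reorgs finished and which are still running. The sample lines
and exact message wording are illustrative, not verbatim server output, and
real actlog lines have a date/time column in front:

```python
import re

# Made-up sample of activity log lines mentioning reorg start/end.
sample = [
    "ANR0293I Reorganization for table BACKUP_OBJECTS started.",
    "ANR0294I Reorganization for table BACKUP_OBJECTS ended.",
    "ANR0293I Reorganization for table ARCHIVE_OBJECTS started.",
]

open_reorgs, finished = set(), []
for line in sample:
    m = re.search(r"(ANR0293I|ANR0294I) Reorganization for table (\S+)", line)
    if not m:
        continue
    msg, table = m.groups()
    if msg == "ANR0293I":
        open_reorgs.add(table)      # start message: reorg in flight
    else:
        open_reorgs.discard(table)  # end message: reorg complete
        finished.append(table)

print("finished:", finished)
print("still running:", sorted(open_reorgs))
```

A table that shows up as "still running" for a long time is a good candidate
for constraining with REORGBEGINTIME/REORGDURATION.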

On Thu, Nov 09, 2017 at 12:57:05PM +0100, Remco Post wrote:
> Hi All,
>
> for my current customer we've built a number of SP 8.1.1 servers. One thing 
> that is new is that sometimes the server doesn't empty the DB2 active log 
> after a full database backup. The only way it seems that the log gets flushed 
> is by restarting the TSM server. I've seen this now 2 or 3 times in the few 
> months that we're actively using these servers and it makes me feel uneasy. 
> And no, it's not like the server is very busy; the server was completely 
> idle as far as I can see.
>
> --
>
>  Met vriendelijke groeten/Kind Regards,
>
> Remco Post
> r.p...@plcs.nl
> +31 6 248 21 622

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Question on Collocation

2017-11-03 Thread Skylar Thompson
Hi Jennifer,

Do you have client-side compression enabled? I'm not familiar with what
the GUI reports, but dsmc will show this in the "Objects compressed by"
line.

On 11/03/2017 02:06 PM, Drammeh, Jennifer C wrote:
> I use collocation and I have a node where the System Admin has requested that 
> his data from a particular node be isolated on tapes by itself. I created a 
> collocation group and have associated this one node with the new group. This 
> node had previously been backing up to a different collocation group. I 
> created a new Policy Domain and associated the node with it as well. This 
> policy domain is set to send the data direct to tape instead of going to our 
> diskpool. I also modified the settings on the server to allow this system to 
> have 2 mount points.
>
> I had the SA launch a "full" manual backup from the GUI - using the "Always 
> Backup" option. Here are the problems I am seeing.
>
>
> 1.   I can see that the system state data went to a tape that was already 
> used by other nodes (which tells me the collocation is not working)
>
> 2.   It only mounted 1 tape in a drive while performing the backup
>
> 3.   The metrics at the end showed that it inspected 283 GB of data and 
> only transferred 175 GB which means it did not actually perform a full backup.
>
> Any ideas to help get a FULL backup of this data alone on a single 1 or 2 
> tapes and ideally using multiple tape drives to speed things up? (for backup 
> and restore)
>
> Thanks!
>
>
> Jennifer


--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354


Re: Magic Decoder Ring needed

2017-10-11 Thread Skylar Thompson
I'm not aware of a fix for the problem (it's with Dell PERC H810s) but the
problem manifested itself in lots and lots of media errors on a physical
device, visible when you export the controller log. The symptoms for TSM
included both CRC errors in the pool and also sporadically awful I/O
throughput.

The controller logs identified the slot with the media errors, and
replacing the drive made all the above problems go away. Of course the real
solution is going to be retiring these soon-to-be-EOSL'd devices, and I've
finally got a budget to do it...

I didn't spend a lot of time looking for a fix, given that we'll be getting
rid of the equipment in a few weeks. It could very well be an interaction
between the RAID HBA and physical disk firmware. Unfortunately the system
has a mix of disk vendors, since Dell isn't consistent about which vendor
they ship for replacements, but the drive I identified was a Fujitsu
MBD2300RC.

On Tue, Oct 10, 2017 at 02:18:01PM -0400, Zoltan Forray wrote:
> Thank you for the info.  We have started running AUDIT's but with 30TB+ in
> this disk stgpool, it will take a while.  I am very interested in
> additional details on the RAID firmware issue you mentioned - any specifics
> would be very helpful.  AFAIK, we are up-to-date on all Dell firmware (we
> patch fairly regularly).
>
> Within the past 9-months, this server has had 3-diskpool volumes (all part
> of RAID-5 arrays) suddenly become "bad", requiring full restores, with no
> explanation since there was no sign of hardware problems. While I did open
> a PMR with IBM, by the time they looked at my last failure, they said there
> was nothing they could do to analyze the problem and to call them back the
> next time it happens.
>
> On Tue, Oct 10, 2017 at 2:04 PM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>
> > Hi Zoltan,
> >
> > We ran into this recently, and it was caused by a firmware bug in a RAID
> > adapter that caused it not to fail an obviously failing disk in our disk
> > spool. We followed the procedure here:
> >
> > https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.
> > 6/tshoot/r_pdg_1330_1331_msg.html
> >
> > It did take a few AUDIT VOLUME / MOVE DATA cycles to find everything but now
> > it's happy. In a few cases, the file shown by SHOW INVO was obviously
> > detritus, so we deleted it client-side with DELETE BACKUP instead of an
> > audit, because it takes a long time to audit our disk volumes.
> >
> > On Tue, Oct 10, 2017 at 01:56:47PM -0400, Zoltan Forray wrote:
> > > Recently we started seeing these errors on one of our servers:
> > >
> > > 10/10/2017 13:35:51  ANR1330E The server has detected possible corruption
> > > in
> > >   an object that is being restored or moved. The
> > actual
> > >
> > >   values for the incorrect frame are: magic 53454652
> > > hdr
> > >   version2 hdr length32 sequence number
> > >  22610
> > >   data length3FFB0 server ID0 segment ID
> > >
> > >   2720223190 crc0. (SESSION: 39218, PROCESS:
> > > 171)
> > > 10/10/2017 13:35:51  ANR1331E Invalid frame detected.  Expected magic
> > > 53454652
> > >
> > > The Process ID po

Re: Magic Decoder Ring needed

2017-10-10 Thread Skylar Thompson
Hi Zoltan,

We ran into this recently, and it was caused by a firmware bug in a RAID
adapter that caused it not to fail an obviously failing disk in our disk
spool. We followed the procedure here:

https://www.ibm.com/support/knowledgecenter/en/SSGSG7_7.1.6/tshoot/r_pdg_1330_1331_msg.html

It did take a few AUDIT VOLUME / MOVE DATA cycles to find everything but now
it's happy. In a few cases, the file shown by SHOW INVO was obviously
detritus, so we deleted it client-side with DELETE BACKUP instead of an
audit, because it takes a long time to audit our disk volumes.

On Tue, Oct 10, 2017 at 01:56:47PM -0400, Zoltan Forray wrote:
> Recently we started seeing these errors on one of our servers:
>
> 10/10/2017 13:35:51  ANR1330E The server has detected possible corruption
> in
>   an object that is being restored or moved. The actual
>
>   values for the incorrect frame are: magic 53454652
> hdr
>   version2 hdr length32 sequence number
>  22610
>   data length3FFB0 server ID0 segment ID
>
>   2720223190 crc0. (SESSION: 39218, PROCESS:
> 171)
> 10/10/2017 13:35:51  ANR1331E Invalid frame detected.  Expected magic
> 53454652
>
> The Process ID points to a Backup Stgpool process (the only thing running),
> not anything being "moved or restored".  There are also a bunch of sessions
> running/stuck/hung but that is a different problem.
>
> Any idea on how to determine what is causing this?  We've seen the error
> quite a few times within the past few days.
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://phishing.vcu.edu/

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: 7.1.8/8.1.3 Security Upgrade Install Issues

2017-10-09 Thread Skylar Thompson
I definitely agree with this; at least for TSM v7 it would have been far
better to call it v7.2.0 to make it clear that it's a huge change with lots
of caveats and potential failure points. We've just now discovered that TSM
v7.1.8 does not play nicely with GPFS/mmbackup due to a change in how SSL
certificates are loaded - hopefully it's a simple fix but who knows...

On Sat, Oct 07, 2017 at 02:36:13PM -0500, Roger Deschner wrote:
> This difficulty comes up while there are open, now-published security
> vulnerabilities out there inviting exploits, and making our Security
> people very nervous. But the considerations described in
> http://www-01.ibm.com/support/docview.wss?uid=swg22004844 make it very
> difficult and risky to proceed with 7.1.8/8.1.3 as though it was just a
> patch. It's a major upgrade, requiring major research and planning, with
> the threat of an exploit constantly hanging over our heads. I really
> wish this had been handled differently.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: 7.1.8/8.1.3 Security Upgrade Install Issues

2017-10-06 Thread Skylar Thompson
're plunging ahead regardless, because of a general policy to apply
> patches quickly for all published security issues. (Like Equifax didn't
> do for Apache.) I'm trying to figure this out fast, because we're doing
> it this coming weekend. I'm sure there are parts of this I don't
> understand. I'm trying to figure out how ugly it's going to be.
>
> Roger Deschner  University of Illinois at Chicago rog...@uic.edu
> ==I have not lost my mind -- it is backed up on tape somewhere.=

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: rebinding objects for API client archives

2017-09-20 Thread Skylar Thompson
Yep that's correct. We actually create a separate management class for
every archive because we know our clients will want to change it.
Fortunately TSM supports an unlimited number of management classes...

On Wed, Sep 20, 2017 at 12:58:40PM +, Rhodes, Richard L. wrote:
> I think I'm confused (not unusual!).
>
> Rebinding is CHANGING the MgtClass an object is bound to, NOT changing the 
> policies within a MgtClass.
>
> If we just change the policies of the MgtClass to retain the archive for 12 
> years, that WILL work.  What WON'T work is creating a NEW MgtClass with 12 
> year retention and trying to rebind to that.
>
> The Archive table in db2 has CLASS_NAME, that's what can't change.  But the 
> policies in the CLASS_NAME can change.
>
> Of course, changing the policies effect everyone using that MgtClass, but we 
> can live with that . . . . I think.
>
>
> Ok, is that right?
>
>
> Thanks!
>
> Rick
>
>
>
>
>
> -Original Message-
> From: Rhodes, Richard L.
> Sent: Wednesday, September 20, 2017 8:19 AM
> To: ADSM-L@VM.MARIST.EDU
> Cc: Ake, Elizabeth K. <a...@firstenergycorp.com>
> Subject: rebinding objects for API client archives
>
> We have a TSM instance that is dedicated to "applications".  These are 
> applications that use the API client to store some kind of object in TSM.  An 
> example is IBM Content Manager.  Nodes are in domains with a default Mgt 
> Class with 7 years retention, and store their data as Archives, although one 
> application uses Backups.
>
> We received a legal hold request to keep some of this data for 12 years.
>
> It's my understanding that you cannot rebind archive data.  That is, changing 
> the MgtClass will do nothing to keep this data longer.
>
> Do we have to change the Mgt Class and have the users retrieve/re-archive the 
> data?
>
>
> Is this correct?  Thoughts?
>
> Thoughts?
>
> Rick
> --
>
> The information contained in this message is intended only for the personal 
> and confidential use of the recipient(s) named above. If the reader of this 
> message is not the intended recipient or an agent responsible for delivering 
> it to the intended recipient, you are hereby notified that you have received 
> this document in error and that any review, dissemination, distribution, or 
> copying of this message is strictly prohibited. If you have received this 
> communication in error, please notify us immediately, and delete the original 
> message.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Automatic password change gets lost between server and client

2017-09-19 Thread Skylar Thompson

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Backup to LTO7 tapes

2017-09-11 Thread Skylar Thompson
We back up genomic data directly to LTO7 (100+ GB files are pretty ideal for
it). I think the most important thing is that you set your TXNBYTELIMIT as
high as possible (we do this via server-side client option set). You can
also set the server option TXNGROUPMAX higher than the default of 4096 but
this won't have any effect if your files are already bigger than
TXNBYTELIMIT.

Finally, if you have a mix of large and small files, make sure the small
files end up in a FILE or DISK pool before getting migrated to tape.
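As a sketch, the server-side pieces might look like the following; the
option set name, node name, and values are illustrative, and the limits can
vary by client and server level, so check yours before using them:

```
/* dsmserv.opt: raise the per-transaction object count (default 4096). */
TXNGROUPMAX 65000

/* Server-side client option set forcing a large transaction size,
   then attach it to the node doing the large-file backups: */
DEFINE CLOPTSET lto7_opts DESCRIPTION="Clients writing large files to LTO7"
DEFINE CLIENTOPT lto7_opts TXNBYTELIMIT 32G FORCE=YES
UPDATE NODE some_node CLOPTSET=lto7_opts
```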

On Wed, Sep 06, 2017 at 06:23:39AM +, rou...@univ.haifa.ac.il wrote:
> Hi to all
>
> A question about backing up directly to LTO7 tapes (FORMAT=ULTRIUM7C) for 
> audio/video data: I want to know if there are special options to add in dsm.opt to 
> improve performance.
>
> TSM client version 7.1.6.5
> TSM server version 8.1.1.0

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: error retrieving archives

2017-08-23 Thread Skylar Thompson
Is the original filesystem different from the destination filesystem? It
looks like the destination filesystem is having issues with the extended
attributes TSM has stored for the archived files.

On Wed, Aug 23, 2017 at 02:51:34PM -0400, David Jelinek wrote:
> 08/23/2017 10:15:58 ANS1587W Unable to read extended attributes for
> object /mnt/tsm/linux/aux-jet/shared/data1/20161002/BMW due to errno:
> 34, reason: Numerical result out of range
> 08/23/2017 10:15:58 ANS0361I DIAG: TransErrno: Unexpected error from
> psGetXattrNameList, errno = 34
>
> The linux machine I am attempting to retrieve on is not the same
> physical machine, but is the same node name. The retrieve is being done
> from root (the archive was also done from root on the former machine).
>
> Many files retrieve ok I am retrieving to a new location. The retrieve
> command is:
>
> dsmc retrieve "/shared/data1/*"
> /mnt/tsm/linux/aux-jet/shared/data1/20161204/ -subdir=yes
> -tod=12/31/2016 -fromdate=11/08/2016
>
> The directories are shared out from the original box to a number of system.
>
> Is there any way for me to retrieve these files?
>
>
> --
> Have a wonderful day,
> David Jelinek

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Simultaneous copy and compatible data formats/device types

2017-08-15 Thread Skylar Thompson

Hi TSMers,

We're experimenting with simultaneous copy for one of our TSM servers,
which gets very few, but very large, files every day from one of its
clients (1 - 2 files, each 10GB - 1TB). It also has clients that
send millions of small files, so we'd like the big data to end up directly
on tape so that our disk spool can accept the small files. We use LTO7 for
onsite data and LTO6 for offsite data.

Our storage hierarchy looks like this:

DISK-BK-PRI-LTO7 - This is the top-level disk storage pool. Relevant
configuration:
   * DEVCLASS: DISK
   * NEXTSTGPOOL: DESPOT-ONSITE-LTO7
   * AUTOCOPY: MIGRATION
   * COPYSTGPOOLS: DESPOT-OFFSITE-LTO6
   * MAXSIZE: 10G
   * DATAFORMAT: NATIVE

DESPOT-ONSITE-LTO7 - This is the primary tape storage pool. Relevant
configuration:
   * DEVCLASS: DESPOT-LTO7
   * NEXTSTGPOOL: NONE
   * AUTOCOPY: ALL
   * COPYSTGPOOLS: DESPOT-OFFSITE-LTO6
   * MAXSIZE: NONE
   * DATAFORMAT: NATIVE

DESPOT-OFFSITE-LTO6 - This is the copy tape storage pool
   * DEVCLASS: DESPOT-LTO6
   * DATAFORMAT: NATIVE

Our device classes look like this:

DESPOT-LTO7:
   * DEVTYPE: LTO
   * FORMAT: ULTRIUM7C
DESPOT-LTO6:
   * DEVTYPE: LTO
   * FORMAT: ULTRIUM6C
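
Roughly, in dsmadmc form (parameters abbreviated; scratch limits and the like are made-up placeholders), the hierarchy above would be defined like:

```
/* Sketch of the pools above - not our exact commands */
DEFINE STGPOOL DISK-BK-PRI-LTO7 DISK MAXSIZE=10G -
   NEXTSTGPOOL=DESPOT-ONSITE-LTO7 AUTOCOPY=MIGRATION -
   COPYSTGPOOLS=DESPOT-OFFSITE-LTO6
DEFINE STGPOOL DESPOT-ONSITE-LTO7 DESPOT-LTO7 POOLTYPE=PRIMARY -
   AUTOCOPY=ALL COPYSTGPOOLS=DESPOT-OFFSITE-LTO6 MAXSCRATCH=100
DEFINE STGPOOL DESPOT-OFFSITE-LTO6 DESPOT-LTO6 POOLTYPE=COPY -
   MAXSCRATCH=100
```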

It seems like this should work, yet we get messages like this when running
migrations from DISK-BK-PRI-LTO7 to DESPOT-ONSITE-LTO7 and
DESPOT-OFFSITE-LTO6:

08/06/2017 16:41:18  ANR1927W Autocopy process 9138 stopped for storage pool
  DESPOT-OFFSITE-LTO6. The data format or device type was
  not compatible with the data format or device type of
  the primary pool. (SESSION: 116305, PROCESS: 9138)

Oddly, though, we don't see similar messages for client backups that
cut through DISK-BK-PRI-LTO7 directly to tape due to its MAXSIZE setting,
so it seems like simultaneous copy is working for clients but not for
migrations.

I'm trying to figure out if this error message is referring strictly to the
storage pool DATAFORMAT parameter and device class DEVTYPE parameter, or if
it is more general, and I can't find anything one way or the other. IBM has
this documentation:

https://www.ibm.com/support/knowledgecenter/SSGSG7_7.1.0/com.ibm.itsm.srv.doc/c_simulwrite_limitations.html

Which says that you can use different device classes for simultaneous copy
as long as the data formats are "compatible" but never defines
"compatible". I have a PMR open with IBM, and they're claiming that
different LTO generations are incompatible, but I can't find any evidence
that that's accurate, especially given that client simultaneous copy is
working.

So I guess my question is - what have other people's experiences and
expectations been with simultaneous copy? Have you gotten a similar setup
to work properly?

Thanks!

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: ndmp

2017-08-01 Thread Skylar Thompson
I agree, we use this approach as well. NDMP has scaling issues, doesn't
play nicely with TSM incremental backups or include/exclude lists, and ties
you into a single storage vendor for both backups and restores. That last
point is particularly scary for anyone writing DR plans, since who knows
what storage you'll end up with after a real disaster.

On Tue, Aug 01, 2017 at 09:27:06PM +, Thomas Denier wrote:
> You might be better off having proxy systems access the NAS contents using 
> CIFS and/or NFS, and having the proxy systems use the backup/archive client 
> to back up the NAS contents.
>
> My department supports Commvault as well as TSM (the result of a merger of 
> previously separate IT organizations). The Commvault workload includes a NAS 
> server on the same scale as yours. Our Commvault representative advised us to 
> forget about Commvault's NDMP support and use the Commvault analog of the 
> approach described in the previous paragraph.
>
> The subject of NAS backup coverage arose at an IBM training/marketing event 
> for the Spectrum family of products. The IBM representative who responded was 
> not as bluntly dismissive of NDMP as our Commvault representative, but he 
> sounded decidedly unenthusiastic when he mentioned NDMP as a possible 
> approach to NAS backups.
>
> Thomas Denier,
> Thomas Jefferson University
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Remco Post
> Sent: Monday, July 31, 2017 16:41
> To: ADSM-L@VM.MARIST.EDU
> Subject: [ADSM-L] ndmp
>
> Hi all,
>
> I'm working on a large TSM implementation for a customer who also has HDS 
> NAS systems, and quite some data in those systems, more than 100 TB that 
> needs to be backed up. We were planning to go 100% directory container for 
> the new environment, but alas IBM's "best of both worlds" (DISK & FILE) 
> doesn't support NDMP and I don't like FILE with deduplication (too much 
> of a hassle), so is it really true, are we really stuck with tape? Isn't it 
> about time after so many years that IBM finally gives us a decent solution to 
> back up NAS systems?
>
> --
>
>  Met vriendelijke groeten/Kind Regards,
>
> Remco Post
> r.p...@plcs.nl
> +31 6 248 21 622
> The information contained in this transmission contains privileged and 
> confidential information. It is intended only for the use of the person named 
> above. If you are not the intended recipient, you are hereby notified that 
> any review, dissemination, distribution or duplication of this communication 
> is strictly prohibited. If you are not the intended recipient, please contact 
> the sender by reply email and destroy all copies of the original message.
>
> CAUTION: Intended recipients should NOT use email communication for emergent 
> or urgent health care matters.
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Non-default management classes with mmbackup

2017-04-03 Thread Skylar Thompson
I've figured this problem out - after poring over the documentation, it
turns out that updating INCLEXCL rules requires running "mmbackup -q" to
update its shadow database.
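
For the archives, the refresh is just a query-mode run against the same server stanza (arguments as in the original post):

```
# -q verifies/rebuilds the mmbackup shadow database so that new
# INCLEXCL rules take effect on the next incremental
mmbackup /gpfs/gs1/noble/vol3 -q --tsm-servers gpfs-gs1-noble-vol3
```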

On Mon, Apr 03, 2017 at 02:37:59PM +0000, Skylar Thompson wrote:
> I'm wondering if anyone has experience assigning non-default management
> classes to new files via mmbackup (I know that it's not possible to rebind
> existing files). Until recently, we've gotten away with just using the
> policy domain default management class, but the documentation here suggests
> that it should be possible to assign non-default management classes as
> well:
>
> http://www-01.ibm.com/support/docview.wss?uid=swg21699569
>
> Unfortunately, when I've tested it, all I get is the default. For
> instance, we have this rule in dsm.sys to assign all files in
> /gpfs/gs1/noble in directories marked "backups" to the noble_lab
> management class:
>
> SErvername  gpfs-gs1-noble-vol3
>...
>include /gpfs/gs1/noble/.../backups/.../* noble_lab
>
> And I make a test file:
>
> dd if=/dev/zero of=/gpfs/gs1/noble/vol3/backups/test7 bs=1M count=1000
>
> I create a snapshot (noble_vol3 is in a fileset so we can manage quotas for
> it):
>
> # mmcrsnapshot gs1 noble_vol3:TSMsnap
>
> And run mmbackup:
>
> # mmbackup /gpfs/gs1/noble/vol3 \
> -t incremental \
> --tsm-servers gpfs-gs1-noble-vol3 \
> -N tsm_clients \
> -S TSMsnap \
> --tsm-errorlog /var/log/dsmerror-gpfs-gs1-noble-vol3.log \
> -a 4 \
> --expire-threads 4 \
> --backup-threads 4 \
> --scope inodespace
>
> Querying the backups afterwards shows that it's in the DEFAULT management
> class, not NOBLE_LAB:
>
> # dsmc q b -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/test7
> ...
>SizeBackup DateMgmt Class   A/I 
> File
>-----   --- 
> 
>104,857,600  B  04/03/2017 07:25:53 DEFAULT  A  
> /gpfs/gs1/noble/vol3/backups/test7
>
> But I know that rule is valid, because I can run "dsmc incremental" 
> afterwards and the file gets rebound:
>
> # dsmc i -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/
> ...
> Rebinding--> 104,857,600 /gpfs/gs1/noble/vol3/backups/test7 [Sent]
>
> # dsmc q b -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/test7
> ...
>SizeBackup DateMgmt Class   A/I 
> File
>    ---------   --- 
> 
>104,857,600  B  04/03/2017 06:48:24NOBLE_LAB A  
> /gpfs/gs1/noble/vol3/backups/test7
>
> Thanks!
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: LTO tape label problem

2017-04-03 Thread Skylar Thompson
Hi Andy,

I'm not familiar with the TS3500 library, but if TSM can see the bulk
import/export slots (definitely possible if it's SCSI-connected), then you
could do a check-in with CHECKLABEL=YES rather than CHECKLABEL=BARCODE.
This will have TSM mount each cartridge and read the tape label, rather
than trust the barcode reader.

Failing that, I wonder if you could setup a library partition to place
these volumes in, and do something similar with SEARCH=YES.
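
As a sketch, the CHECKLABEL=YES check-in might look like this (library name is hypothetical):

```
/* Mounts each cartridge found in the bulk I/O slots and reads its
   internal label instead of trusting the barcode */
CHECKIN LIBVOLUME TS3500LIB SEARCH=BULK CHECKLABEL=YES STATUS=SCRATCH
```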

On Fri, Mar 31, 2017 at 04:14:52PM +, Huebner, Andy wrote:
> We have recently received a number of tapes from a remote site that have tape 
> labels that are not compatible with our TS3500.  They are too long and the 
> paper label does not exactly match the electronic label.
>
> We have found we can hand mount a tape and see the soft label.
>
> The problem is that the soft label is not valid for the library.  An example is 
> LTO00021L; we have all of the 20 series.  Truncating leaves me with a bunch 
> of LTO0002 tapes.
>
> Short of hand mounting many tapes, how can I have TSM mount and read these 
> tapes?
>
> Thank you,
>
> Andy Huebner

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Non-default management classes with mmbackup

2017-04-03 Thread Skylar Thompson
I'm wondering if anyone has experience assigning non-default management
classes to new files via mmbackup (I know that it's not possible to rebind
existing files). Until recently, we've gotten away with just using the
policy domain default management class, but the documentation here suggests
that it should be possible to assign non-default management classes as
well:

http://www-01.ibm.com/support/docview.wss?uid=swg21699569

Unfortunately, when I've tested it, all I get is the default. For
instance, we have this rule in dsm.sys to assign all files in
/gpfs/gs1/noble in directories marked "backups" to the noble_lab
management class:

SErvername  gpfs-gs1-noble-vol3
   ...
   include /gpfs/gs1/noble/.../backups/.../* noble_lab

And I make a test file:

dd if=/dev/zero of=/gpfs/gs1/noble/vol3/backups/test7 bs=1M count=1000

I create a snapshot (noble_vol3 is in a fileset so we can manage quotas for
it):

# mmcrsnapshot gs1 noble_vol3:TSMsnap

And run mmbackup:

# mmbackup /gpfs/gs1/noble/vol3 \
-t incremental \
--tsm-servers gpfs-gs1-noble-vol3 \
-N tsm_clients \
-S TSMsnap \
--tsm-errorlog /var/log/dsmerror-gpfs-gs1-noble-vol3.log \
-a 4 \
--expire-threads 4 \
--backup-threads 4 \
--scope inodespace

Querying the backups afterwards shows that it's in the DEFAULT management
class, not NOBLE_LAB:

# dsmc q b -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/test7
...
   SizeBackup DateMgmt Class   A/I File
   -----   --- 
   104,857,600  B  04/03/2017 07:25:53 DEFAULT   A  /gpfs/gs1/noble/vol3/backups/test7

But I know that rule is valid, because I can run "dsmc incremental" afterwards
and the file gets rebound:

# dsmc i -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/
...
Rebinding--> 104,857,600 /gpfs/gs1/noble/vol3/backups/test7 [Sent]

# dsmc q b -se=gpfs-gs1-noble-vol3 /gpfs/gs1/noble/vol3/backups/test7
...
   SizeBackup DateMgmt Class   A/I File
   -----   --- 
   104,857,600  B  04/03/2017 06:48:24 NOBLE_LAB A  /gpfs/gs1/noble/vol3/backups/test7

Thanks!

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Advice for archiving 80 billion of small files.

2017-01-20 Thread Skylar Thompson
Do you need to recover files individually? If so, then image backup (at
least on its own) won't be a good option. One thing you could do is tar up
chunks of files (maybe a million per chunk) and archive/backup those chunks.
Keep a catalog (ideally a database with indexes) of which files are in which
tarballs; then, when you go to restore a single file, you only have to
recover the one chunk that contains it rather than everything.

On Fri, Jan 20, 2017 at 02:18:04PM +, Bo Nielsen wrote:
> Hi all,
>
> I need advice.
> I must archive 80 billion small files, but that is not possible, as I see it.
> since it will fill in the TSM's Database about 73 Tb.
> The filespace is mounted on a Linux server.
> Is there a way to pack/zip the files, so it's a smaller number of files.
> anybody who has tried this ??
>
> Regards,
>
> Bo Nielsen
>
>
> IT Service
>
>
>
> Technical University of Denmark
>
> IT Service
>
> Frederiksborgvej 399
>
> Building 109
>
> DK - 4000 Roskilde
>
> Denmark
>
> Mobil +45 2337 0271
>
> boa...@dtu.dk<mailto:boa...@dtu.dk>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: TSM Server on CentOS Linux

2017-01-20 Thread Skylar Thompson
We run TSM on CentOS6 x86_64 with HP LTO5, HP LTO6, and IBM LTO7 drives,
and a pair of ACSLS-managed STK SL3000 libraries. We haven't had any
problems - technical, support, or otherwise. I think the only unusual thing
we had to do was pass

-vmargs "-DBYPASS_TSM_REQ_CHECKS=true"

to install.sh for the initial install.

As Del notes, there might be some pushback on tape device support, but we
haven't had any problems so can't comment on how it actually plays out.

On Thu, Jan 19, 2017 at 06:10:46PM -0600, Roger Deschner wrote:
> Management here is contemplating having us move our production TSM
> servers to the CentOS Linux operating system, which is a free branch of
> Red Hat.
>
> Has anybody done this? What are the support issues with IBM?
>
> (TSM Client is already supported on CentOS via "Best effort".)
>
> Roger Deschner  University of Illinois at Chicago rog...@uic.edu
> ==I have not lost my mind -- it is backed up on tape somewhere.=

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: DECOMMISSION NODE

2017-01-12 Thread Skylar Thompson
Bummer, I guess I've never tried canceling the process. It sounds like a
bug to me, as other processes in TSM are supposed to be transactional and
incremental.

On Thu, Jan 12, 2017 at 03:44:24PM -0500, Zoltan Forray wrote:
> It won't continue.  When I canceled the processes, it said something about
> undoing the decommissioning.   However, when I just tried to restart the
> DECOMMISSION NODE, it errors saying the node is already decommissioned?
> Looking back in the logs, it says the decommission ended with "completion
> state of SUCCESS" (wrong) even though I canceled it.  Sounds like a bug to
> me.  Looks like I have to drag out the sledgehammer and delete it manually!
>
> On Thu, Jan 12, 2017 at 3:05 PM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>
> > The scan portion probably will take the same amount of time, but the
> > heavy-hitting part (marking active objects as inactive) should pick up
> > where
> > it left off.
> >
> > On Thu, Jan 12, 2017 at 03:00:08PM -0500, Zoltan Forray wrote:
> > > This node has >230M objects (both offsite and onsite) and total occupancy
> > > of 12TB.  It got to ~80M when I had to kill it.  Sure wish I knew if it
> > was
> > > going to start all over again or pick-up where it left off?  I have more
> > > maintenance on this TSM server scheduled for Tuesday and if it starts all
> > > over again, it clearly won't finish by then.
> > >
> > > On Thu, Jan 12, 2017 at 2:20 PM, Matthew McGeary <
> > > matthew.mcge...@potashcorp.com> wrote:
> > >
> > > > Hello Zoltan,
> > > >
> > > > I use it every day, mostly because of changes to our VMware environment
> > > > (VMs seem to breed like rabbits and die like fruit flies.)  It never
> > seems
> > > > to take much time in those cases, but the object count and data stored
> > in
> > > > those cases isn't typically very large.
> > > >
> > > > I've never tried to decomm a node that is TB in size or one that
> > contains
> > > > millions of objects.
> > > > __
> > > > Matthew McGeary
> > > > Senior Technical Specialist - Infrastructure Management Services
> > > > PotashCorp
> > > > T: (306) 933-8921
> > > > www.potashcorp.com
> > > >
> > > >
> > > > -Original Message-
> > > > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf
> > Of
> > > > Zoltan Forray
> > > > Sent: Thursday, January 12, 2017 1:15 PM
> > > > To: ADSM-L@VM.MARIST.EDU
> > > > Subject: [ADSM-L] DECOMMISSION NODE
> > > >
> > > > Anyone out there using the DECOMMISSION NODE command?  I tried it on an
> > > > old, inactive node and after running for 4-days, I had to cancel it
> > due to
> > > > scheduled TSM server maintenance.
> > > >
> > > > My issue is, since it was only 35% finished (based on the number of
> > > > objects processed), will it start from the beginning or remember where
> > it
> > > > left off?
> > > >
> > > > --
> > > > *Zoltan Forray*
> > > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator Xymon
> > > > Monitor Administrator VMware Administrator (in training) Virginia
> > > > Commonwealth University UCC/Office of Technology Services
> > www.ucc.vcu.edu
> > > > zfor...@vcu.edu - 804-828-4807 Don't be a phishing victim - VCU and
> > other
> > > > reputable organizations will never use email to request that you reply
> > with
> > > > your password, social security number or confidential personal
> > information.
> > > > For more details visit http://infosecurity.vcu.edu/phishing.html
> > > >
> > >
> > >
> > >
> > > --
> > > *Zoltan Forray*
> > > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> > > Xymon Monitor Administrator
> > > VMware Administrator (in training)
> > > Virginia Commonwealth University
> > > UCC/Office of Technology Services
> > > www.ucc.vcu.edu
> > > zfor...@vcu.edu - 804-828-4807
> > > Don't be a phishing victim - VCU and other reputable organizations will
> > > never use email to request that you reply with your password, social
> > > security number or confidential personal information. For more details
> > > visit http://infosecurity.vcu.edu/phishing.html
> >

Re: DECOMMISSION NODE

2017-01-12 Thread Skylar Thompson
The scan portion probably will take the same amount of time, but the
heavy-hitting part (marking active objects as inactive) should pick up where
it left off.

On Thu, Jan 12, 2017 at 03:00:08PM -0500, Zoltan Forray wrote:
> This node has >230M objects (both offsite and onsite) and total occupancy
> of 12TB.  It got to ~80M when I had to kill it.  Sure wish I knew if it was
> going to start all over again or pick-up where it left off?  I have more
> maintenance on this TSM server scheduled for Tuesday and if it starts all
> over again, it clearly won't finish by then.
>
> On Thu, Jan 12, 2017 at 2:20 PM, Matthew McGeary <
> matthew.mcge...@potashcorp.com> wrote:
>
> > Hello Zoltan,
> >
> > I use it every day, mostly because of changes to our VMware environment
> > (VMs seem to breed like rabbits and die like fruit flies.)  It never seems
> > to take much time in those cases, but the object count and data stored in
> > those cases isn't typically very large.
> >
> > I've never tried to decomm a node that is TB in size or one that contains
> > millions of objects.
> > __
> > Matthew McGeary
> > Senior Technical Specialist - Infrastructure Management Services
> > PotashCorp
> > T: (306) 933-8921
> > www.potashcorp.com
> >
> >
> > -Original Message-
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of
> > Zoltan Forray
> > Sent: Thursday, January 12, 2017 1:15 PM
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: [ADSM-L] DECOMMISSION NODE
> >
> > Anyone out there using the DECOMMISSION NODE command?  I tried it on an
> > old, inactive node and after running for 4-days, I had to cancel it due to
> > scheduled TSM server maintenance.
> >
> > My issue is, since it was only 35% finished (based on the number of
> > objects processed), will it start from the beginning or remember where it
> > left off?
> >
> > --
> > *Zoltan Forray*
> > Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator Xymon
> > Monitor Administrator VMware Administrator (in training) Virginia
> > Commonwealth University UCC/Office of Technology Services www.ucc.vcu.edu
> > zfor...@vcu.edu - 804-828-4807 Don't be a phishing victim - VCU and other
> > reputable organizations will never use email to request that you reply with
> > your password, social security number or confidential personal information.
> > For more details visit http://infosecurity.vcu.edu/phishing.html
> >
>
>
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator (in training)
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: DECOMMISSION NODE

2017-01-12 Thread Skylar Thompson
We recently ran it on a node with ~6 million objects, and it worked fine. I
think it ran for about an hour before completing, but it definitely
thrashed the database.

On Thu, Jan 12, 2017 at 02:14:37PM -0500, Zoltan Forray wrote:
> Anyone out there using the DECOMMISSION NODE command?  I tried it on an
> old, inactive node and after running for 4-days, I had to cancel it due to
> scheduled TSM server maintenance.
>
> My issue is, since it was only 35% finished (based on the number of objects
> processed), will it start from the beginning or remember where it left off?
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator (in training)
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: PCI and TSM

2017-01-05 Thread Skylar Thompson
If you use LTO (and probably the proprietary tape technologies as well),
you get drive encryption for free. The database backups store the
encryption key so you'll have to deal with those separately. You can also
get an out-of-band encryption appliance that talks to the tape drives,
which moves the key management problem outside TSM, but at an increase in
complexity.
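
For LTO-4 and later drives, application-managed encryption is switched on at the device class; a sketch, with a made-up device class name:

```
/* With DRIVEENCRYPTION=ON the server generates and stores the data
   encryption keys - which is why the database backups themselves
   need separate handling */
UPDATE DEVCLASS ONSITE-LTO7 DRIVEENCRYPTION=ON
```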

As of TSM v6, you can also do SSL encryption between the TSM server and
clients. You would have to leverage some configuration management system to
do the certificate management.
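
A minimal sketch of the SSL piece, assuming the server certificate has already been imported into the client's key database (hostname and port are placeholders):

```
* dsmserv.opt - add a dedicated SSL listener port
SSLTCPPORT        23444

* dsm.sys stanza on the client
SErvername        tsm-ssl
   TCPServeraddress   tsm.example.org
   TCPPort            23444
   SSL                Yes
```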

As for data at rest in your disk pool, you could either mitigate that with
client-side encryption, or you could encrypt the filesystem at either the
OS or drive layer (self-encrypting drives don't cost much more than regular
drives).

I don't have experience with PCI, but we have NIST/FIPS requirements that
have been satisfied with tape and hard drive encryption, along with
physical security measures. At some point I'd like to roll out SSL as well
but haven't had time to do it.

On Thu, Jan 05, 2017 at 04:05:24PM -0500, Zoltan Forray wrote:
> I am looking for some guidelines / experience when it comes to the
> requirements for a TSM server to backup client servers that handles PCI
> (Payment Card Industry) data. I have no experience in this area and the
> person pushing/guiding this has very little experience.
>
> Besides the obvious of encrypting the backups from the user/client side,
> how do you handle things like making offsite copies (which are also
> encrypted) using tape?
>
> They are talking about setting up a new TSM server just to backup 12-PCI
> servers, on a separate, isolated network/subnet.  When I mentioned that the
> tape drives used to make the offsite copies is managed by a different TSM
> server, which would have to communicate with this isolated TSM server
> (even though the data is transferred via fibre), they didn't think that
> would be acceptable so now we are looking to get another tape drive to
> dedicate to this isolated server.
>
> In my opinion, this is overkill.
>
> Your thoughts/wisdom?
>
> --
> *Zoltan Forray*
> Spectrum Protect (p.k.a. TSM) Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator (in training)
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807
> Don't be a phishing victim - VCU and other reputable organizations will
> never use email to request that you reply with your password, social
> security number or confidential personal information. For more details
> visit http://infosecurity.vcu.edu/phishing.html

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Mixed media scratch preferences in an ACSLS library

2016-11-30 Thread Skylar Thompson
Hi ADSMers,

I'm having some trouble with TSM not picking the right media type when
requesting a scratch tape for a storage pool in an ACSLS-managed library.
This library is a SL3000, with a mix of LTO6 and LTO7 drives, and LTO5,
LTO6, and LTO7 media.

We're using the "pseudo-partitioning" method, where we have a separate
library for each drive type (i.e. ACSLS-LIB1-LTO6 and ACSLS-LIB1-LTO7), and
separate device classes for each media type (LIB1-LTO5, LIB1-LTO6,
LIB1-LTO7). LIB1-LTO5 and LIB1-LTO6 both use ACSLS-LIB1-LTO6, while
LIB1-LTO7 uses ACSLS-LIB1-LTO7. The device class recording format is set to
a specific LTO generation: LIB1-LTO5 is set to ULTRIUM5C, LIB1-LTO6 is set
to ULTRIUM6C, and LIB1-LTO7 is set to ULTRIUM7C.

When we load media into the library, we have ACSLS automatically move the
cartridges from the CAP into the storage cells. Then, we check the LTO5
and LTO6 scratch media into the LTO6 library, and the LTO7 scratch media
into the LTO7 library.

Unfortunately, it seems that TSM is not completely aware of the media type
of the volumes in the LTO6 library. We've noticed a few instances of when
it's used a LTO6 cartridge for a storage pool using the LTO5 device class,
and vice versa. Oddly, this doesn't even cause any errors.

Have other folks solved this problem and, if so, what have been the fixes?

Thanks!

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: NFS mounts backed up

2016-11-30 Thread Skylar Thompson
Thanks, Eric! Have you had a chance to test this out? My understanding was
that AUTOMOUNT was only needed if you wanted to back up the automounted
filesystems, rather than exclude them.

On Wed, Nov 30, 2016 at 10:08:02AM +, Loon, Eric van (ITOPT3) - KLM wrote:
> Hi all,
> Just for the archives: IBM confirmed that my issue is caused by a TSM BA 
> client design change (7.1.4.1 and higher) documented in APAR IT16782 
> (http://www-01.ibm.com/support/docview.wss?crawler=1=swg1IT16782)
> Kind regards,
> Eric van Loon
> Air France/KLM Storage Engineering
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Skylar Thompson
> Sent: donderdag 17 november 2016 17:36
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: NFS mounts backed up
>
> All of our autofs-managed mounts are in a common /net path, so we just have 
> an "exclude.dir /net" rule on the hosts that shouldn't be backing up NFS.
>
> On Thu, Nov 17, 2016 at 04:28:47PM +, Loon, Eric van (ITOPT3) - KLM wrote:
> > Hi Skylar!
> > This could be the case, but how does one prevent this then?
> > The customer stated that the NFS filesystems weren't backed up before and 
> > the issue started a few weeks ago. They first suspected the upgrade to 
> > 7.1.6, but we just installed the previous version (7.1.4.4) and this 
> > version backs them up too...
> > Thanks again for your help!
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage Engineering
> >
> >
> > -Original Message-
> > From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf
> > Of Skylar Thompson
> > Sent: donderdag 17 november 2016 16:06
> > To: ADSM-L@VM.MARIST.EDU
> > Subject: Re: NFS mounts backed up
> >
> > Are you using an automounter that provides directory entries for the mount 
> > points before they're mounted (aka "ghost" mounts)? If so, TSM will detect 
> > the directory entries and the automounter can mount the filesystems before 
> > TSM can detect them as NFS.
> >
> > On Thu, Nov 17, 2016 at 02:54:17PM +, Loon, Eric van (ITOPT3) - KLM 
> > wrote:
> > > Hi guys!
> > > We have a host with a TSM client 7.1.6 with several NFS mounts. As far as 
> > > I know TSM should not backup NFS mounts, unless explicitly specified or 
> > > when set through the DOMAIN or INCLUDE statement. On this node neither 
> > > one is used but as soon as we issue a dsmc i without any additional 
> > > parameters the NFS mounts are backed up too.
> > > Any idea what could be causing this? Thanks for any help in advance!
> > > Kind regards,
> > > Eric van Loon
> > > Air France/KLM Storage Engineering
> > >
> > > 
> > > For information, services and offers, please visit our web site: 
> > > http://www.klm.com. This e-mail and any attachment may contain 
> > > confidential and privileged material intended for the addressee only. If 
> > > you are not the addressee, you are notified that no part of the e-mail or 
> > > any attachment may be disclosed, copied or distributed, and that any 
> > > other action related to this e-mail or attachment is strictly prohibited, 
> > > and may be unlawful. If you have received this e-mail by error, please 
> > > notify the sender immediately by return e-mail, and delete this message.
> > >
> > > Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or its 
> > > employees shall not be liable for the incorrect or incomplete 
> > > transmission of this e-mail or any attachments, nor responsible for any 
> > > delay in receipt.
> > > Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal
> > > Dutch Airlines) is registered in Amstelveen, The Netherlands, with
> > > registered number 33014286
> > > 
> >
> > --
> > -- Skylar Thompson (skyl...@u.washington.edu)
> > -- Genome Sciences Department, System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- University of Washington School of Medicine

Re: NFS mounts backed up

2016-11-17 Thread Skylar Thompson
All of our autofs-managed mounts are in a common /net path, so we just have
an "exclude.dir /net" rule on the hosts that shouldn't be backing up NFS.
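The rule itself is a one-line entry in the client include-exclude options (the /net path is just our local convention; adjust to wherever your automounter puts its mounts):

```
exclude.dir /net
```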

On Thu, Nov 17, 2016 at 04:28:47PM +, Loon, Eric van (ITOPT3) - KLM wrote:
> Hi Skylar!
> This could be the case, but how does one prevent this then?
> The customer stated that the NFS filesystems weren't backed up before and the 
> issue started a few weeks ago. They first suspected the upgrade to 7.1.6, but 
> we just installed the previous version (7.1.4.4) and this version backs them 
> up too...
> Thanks again for your help!
> Kind regards,
> Eric van Loon
> Air France/KLM Storage Engineering
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Skylar Thompson
> Sent: donderdag 17 november 2016 16:06
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: NFS mounts backed up
>
> Are you using an automounter that provides directory entries for the mount 
> points before they're mounted (aka "ghost" mounts)? If so, TSM will detect 
> the directory entries and the automounter can mount the filesystems before 
> TSM can detect them as NFS.
>
> On Thu, Nov 17, 2016 at 02:54:17PM +, Loon, Eric van (ITOPT3) - KLM wrote:
> > Hi guys!
> > We have a host with a TSM client 7.1.6 with several NFS mounts. As far as I 
> > know TSM should not backup NFS mounts, unless explicitly specified or when 
> > set through the DOMAIN or INCLUDE statement. On this node neither one is 
> > used but as soon as we issue a dsmc i without any additional parameters the 
> > NFS mounts are backed up too.
> > Any idea what could be causing this? Thanks for any help in advance!
> > Kind regards,
> > Eric van Loon
> > Air France/KLM Storage Engineering
> >
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: NFS mounts backed up

2016-11-17 Thread Skylar Thompson
Are you using an automounter that provides directory entries for the mount
points before they're mounted (aka "ghost" mounts)? If so, TSM will detect
the directory entries and the automounter can mount the filesystems before
TSM can detect them as NFS.

On Thu, Nov 17, 2016 at 02:54:17PM +, Loon, Eric van (ITOPT3) - KLM wrote:
> Hi guys!
> We have a host with a TSM client 7.1.6 with several NFS mounts. As far as I 
> know TSM should not backup NFS mounts, unless explicitly specified or when 
> set through the DOMAIN or INCLUDE statement. On this node neither one is used 
> but as soon as we issue a dsmc i without any additional parameters the NFS 
> mounts are backed up too.
> Any idea what could be causing this? Thanks for any help in advance!
> Kind regards,
> Eric van Loon
> Air France/KLM Storage Engineering
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: TSM Migration Question

2016-09-21 Thread Skylar Thompson
The only oddity I see is that DDSTGPOOL4500 has a NEXTSTGPOOL of TAPEPOOL.
Shouldn't cause any problems now since utilization is 0% but would get
triggered once you hit the HIGHMIG threshold.

Is there anything in the activity log for the errant migration processes?
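A hedged starting point for that check (QUERY ACTLOG is the relevant command; the search string is just an example to adapt):

```
query actlog begindate=today-1 search="migration"
```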

On Wed, Sep 21, 2016 at 03:28:53PM +, Plair, Ricky wrote:
> OLD STORAGE POOL
>
> tsm: PROD-TSM01-VM>q stg ddstgpool f=d
>
> Storage Pool Name: DDSTGPOOL
> Storage Pool Type: Primary
> Device Class Name: DDFILE
>Estimated Capacity: 402,224 G
>Space Trigger Util: 69.4
>  Pct Util: 70.4
>  Pct Migr: 70.4
>   Pct Logical: 95.9
>  High Mig Pct: 100
>   Low Mig Pct: 95
>   Migration Delay: 0
>Migration Continue: Yes
>   Migration Processes: 26
> Reclamation Processes: 10
> Next Storage Pool: DDSTGPOOL4500
>  Reclaim Storage Pool:
>Maximum Size Threshold: No Limit
>Access: Read/Write
>   Description:
> Overflow Location:
> Cache Migrated Files?:
>Collocate?: No
> Reclamation Threshold: 70
> Offsite Reclamation Limit:
>   Maximum Scratch Volumes Allowed: 3,000
>Number of Scratch Volumes Used: 2,947
> Delay Period for Volume Reuse: 0 Day(s)
>Migration in Progress?: No
>  Amount Migrated (MB): 0.00
>  Elapsed Migration Time (seconds): 4,560
>  Reclamation in Progress?: Yes
>Last Update by (administrator): RPLAIR
> Last Update Date/Time: 09/21/2016 09:05:51
>  Storage Pool Data Format: Native
>  Copy Storage Pool(s):
>   Active Data Pool(s):
>   Continue Copy on Error?: Yes
>  CRC Data: No
>  Reclamation Type: Threshold
>   Overwrite Data when Deleted:
> Deduplicate Data?: No
>  Processes For Identifying Duplicates:
> Duplicate Data Not Stored:
>Auto-copy Mode: Client
> Contains Data Deduplicated by Client?: No
>
>
>
> NEW STORAGE POOL
>
> tsm: PROD-TSM01-VM>q stg ddstgpool4500 f=d
>
> Storage Pool Name: DDSTGPOOL4500
> Storage Pool Type: Primary
> Device Class Name: DDFILE1
>Estimated Capacity: 437,159 G
>Space Trigger Util: 21.4
>  Pct Util: 6.7
>  Pct Migr: 6.7
>   Pct Logical: 100.0
>  High Mig Pct: 90
>   Low Mig Pct: 70
>   Migration Delay: 0
>Migration Continue: Yes
>   Migration Processes: 1
> Reclamation Processes: 1
> Next Storage Pool: TAPEPOOL
>  Reclaim Storage Pool:
>Maximum Size Threshold: No Limit
>Access: Read/Write
>   Description:
> Overflow Location:
> Cache Migrated Files?:
>Collocate?: No
> Reclamation Threshold: 70
> Offsite Reclamation Limit:
>   Maximum Scratch Volumes Allowed: 3,000
>Number of Scratch Volumes Used: 0
> Delay Period for Volume Reuse: 0 Day(s)
>Migration in Progress?: No
>  Amount Migrated (MB): 0.00
>  Elapsed Migration Time (seconds): 0
>  Reclamation in Progress?: No
>Last Update by (administrator): RPLAIR
> Last Update Date/Time: 09/21/2016 08:38:58
>  Storage Pool Data Format: Native
>  Copy Storage Pool(s):
>   Active Data Pool(s):
>   Continue Copy on Error?: Yes
>  CRC Data: No
>  Reclamation Type: Threshold
>   Overwrite Data when Deleted:
> Deduplicate Data?: No
>  Processes For Identifying Duplicates:
> Duplicate Data Not Stored:
>Auto-copy Mode: Client
> Contains Data Deduplicated by Client?: No
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On

Re: TSM Migration Question

2016-09-21 Thread Skylar Thompson
Can you post the output of "Q STG F=D" for each of those pools?

On Wed, Sep 21, 2016 at 02:33:42PM +, Plair, Ricky wrote:
> Within TSM I am migrating an old storage pool on a DD4200 to a new storage 
> pool on a DD4500.
>
> First of all, it worked fine yesterday.
>
> The nextpool is correct and migration is hi=0 lo=0 and using 25 migration 
> processes, but I had to stop it.
>
> Now when I restart the migration process, it is migrating to the old 
> storage volumes instead of the new storage volumes. Basically it's just 
> migrating from one disk volume inside the ddstgpool to another disk volume in 
> the ddstgpool.
>
> It is not using the next pool parameter; has anyone seen this problem before?
>
> I appreciate the help.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Archive interrupted

2016-09-21 Thread Skylar Thompson
We use a script that takes in a set of filesystem paths, gets a list of
files in those paths, queries a given TSM node for already-archived files,
and then generates a list of files that are in the former list but not the
latter. It then passes that final list to "dsmc archive" via -filelist. In
the event of a problem, the files in the active transaction will be lost
but the files already committed are stored safely, and the script will pick
up the archive where it left off.
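The core of that script is a set difference. A minimal sketch of the idea in Python (the paths are made up, and a real script also has to normalize the local listing and the "dsmc query archive" output so paths compare equal):

```python
def pending_archives(local_files, archived_files):
    """Return paths present on disk but not yet archived, preserving order."""
    archived = set(archived_files)
    return [p for p in local_files if p not in archived]

# Hypothetical listings: 'find' output vs. a parsed 'dsmc query archive' report.
on_disk = ["/data/run1/a.dat", "/data/run1/b.dat", "/data/run1/c.dat"]
in_tsm = ["/data/run1/a.dat", "/data/run1/b.dat"]

todo = pending_archives(on_disk, in_tsm)
# 'todo' is then written one path per line and handed to the client, e.g.:
#   dsmc archive -filelist=/tmp/todo.list
```

Because already-committed files drop out of the difference on the next run, the script naturally resumes where the interrupted archive left off.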

For the most part, we use RETINIT=EVENT for our archives, so that at the
end we can do SET EVENT and make sure all the files have the same
retention.

Note that you *do* have to be careful on restore/retrieve: if the client is
killed in the middle of a restore/retrieve, you risk having a partial file
on disk, that -REPLACE=YES or -REPLACE=IFNEWER will not replace.

On Wed, Sep 21, 2016 at 08:21:58AM -0400, Zoltan Forray wrote:
> A user was running a large archive and the server was accidentally rebooted.
>
> Am I correct that he must start all over again - there is no appending to
> an existing archive?  I assume the archive that was running is still
> good/viable.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: NDMP backups Restore

2016-09-06 Thread Skylar Thompson
Yep, this is correct. It is one of the reasons I have avoided NDMP backups
in favor of NFS/CIFS backups.

On Tue, Sep 06, 2016 at 03:23:58PM +, Huebner, Andy wrote:
> My understanding is NDMP is a standard for the transport of data, not the 
> format of the data.  I would not expect an NDMP backup from one vendor to 
> work on the next vendor.
>
> As an example: NetApp supports compression and does not decompress for the 
> backup.  They next system would have to understand NetApp compression.
>
> Andy Huebner
> SME - Storage and Backups GDC - Fort Worth
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Remco Post
> Sent: Tuesday, September 06, 2016 2:47 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] NDMP backups Restore
>
> > On 05 Sep 2016, at 14:55, Nick Marouf <mar...@gmail.com> wrote:
> >
> > Hello ADSM,
> >
> > We are currently migrating away from NetApp (NDMP to tape) to a new
> > storage subsystem (Infinidat).
> >
> > My belief is that since NDMP is a common protocol, we should be able
> > to restore those via TSM to volumes on to the Infinidats or any other
> > alternate storage destination.
> >
>
> I think you find that "computer says no". I'm guessing that your 
> current NDMP dumps are in NetApp format, so you can possibly restore those to 
> every NetApp you wish, but nothing else.
>
> > Does anyone have any experience or tips? I feel that I'm only
> > scratching the surface with this topics.
> >
> >
> > Thank you
> > -Nick
> >
> >
> >
> > --
> >
> > PGP PUBLIC KEY
> >
> > https://keybase.io/marouf/key.asc
>
> --
>
>  Met vriendelijke groeten/Kind Regards,
>
> Remco Post
> r.p...@plcs.nl
> +31 6 248 21 622

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Which process generates ANR1423W messages?

2016-09-01 Thread Skylar Thompson
I think it's probably DRM-managed volumes, and generated as expiration or
reclamation runs. If you run "Q DRM ", they should show up as
VAULTRETRIEVE. When you bring them onsite, you can run "MOVE DRM
 TOSTATE=ONSITERETRIEVE" and they will immediately become
scratch.
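In command form (VOL001 is a placeholder volume name; QUERY DRMEDIA and MOVE DRMEDIA are the DRM commands involved):

```
query drmedia * wherestate=vaultretrieve
move drmedia VOL001 tostate=onsiteretrieve
```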

On Thu, Sep 01, 2016 at 06:23:13PM +, Robert Talda wrote:
> Folks:
>   Does anyone know which process generates ANR1423W messages?  The message 
> itself is somewhat innocuous:
>
> ANR1423W Scratch volume VV is empty but will not be deleted - volume 
> access mode is "offsite"
>
>   but the intriguing part is there is no session or process associated with 
> the message, either in the activity log or in the ACTLOG table.  Nor are 
> there any entries in the summary table for these entities - and the only 
> process running at the time was a storage pool backup for a different storage 
> pool.  There were client backups in progress, but this message originated 
> from the server.
>
>   I had two volumes with errors that I was struggling to get the data off for 
> several days - and suddenly, magically, an ANR1423W appeared for both 
> volumes.  Headache gone, curiosity piqued.
>
> Thanks in advance,
> Bob T
>
>
> Robert Talda
> EZ-Backup Systems Engineer
> Cornell University
> +1 607-255-8280
> r...@cornell.edu
>
>

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Basic select help

2016-08-23 Thread Skylar Thompson
Not quite sure why that query is not working, but LEFT OUTER JOIN works
fine for me:

SELECT n.node_name FROM nodes n LEFT OUTER JOIN filespaces f ON 
n.node_name=f.node_name WHERE f.node_name IS NULL
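One classic way the NOT IN form can silently return nothing is a NULL in the subquery's column: `x NOT IN (..., NULL)` evaluates to unknown for every row, so no row qualifies. Whether that is what bit Shawn's server I can't say, but a small SQLite demo illustrates the SQL semantics:

```python
import sqlite3

con = sqlite3.connect(":memory:")
cur = con.cursor()
cur.execute("CREATE TABLE nodes (node_name TEXT)")
cur.execute("CREATE TABLE filespaces (node_name TEXT)")
cur.executemany("INSERT INTO nodes VALUES (?)", [("SHAWNTEST",), ("NODE1",)])
# NODE1 has a filespace; a stray NULL also shows up in the subquery's column.
cur.executemany("INSERT INTO filespaces VALUES (?)", [("NODE1",), (None,)])

# NOT IN against a result set containing NULL matches no rows at all:
not_in = cur.execute(
    "SELECT node_name FROM nodes WHERE node_name NOT IN"
    " (SELECT node_name FROM filespaces)").fetchall()

# The LEFT OUTER JOIN form still finds the node with no filespaces:
left_join = cur.execute(
    "SELECT n.node_name FROM nodes n LEFT OUTER JOIN filespaces f"
    " ON n.node_name = f.node_name WHERE f.node_name IS NULL").fetchall()
```

Here `not_in` comes back empty while `left_join` correctly returns SHAWNTEST.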

On Tue, Aug 23, 2016 at 12:58:47PM +, Loon, Eric van (ITOPT3) - KLM wrote:
> Hi David!
> Your query doesn't work either. I was puzzling too with Shawn's SQL query and 
> I don't understand why it isn't working.
> Kind regards,
> Eric van Loon
> Air France/KLM Storage Engineering
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> David Ehresman
> Sent: dinsdag 23 augustus 2016 14:13
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: Basic select help
>
> I think this should do what you want:
>
> select node_name  from nodes where node_name not in (select node_name from 
> filespaces)
>
> David
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Shawn Drew
> Sent: Monday, August 22, 2016 8:00 PM
> To: ADSM-L@VM.MARIST.EDU
> Subject: [ADSM-L] Basic select help
>
> I am trying to get a list of nodes that have no filespaces and I am getting 
> stuck on what seems to be a very basic select statement.  Can someone tell me 
> where I am going wrong?
> The way I understand it, the select should at least show the node I just 
> created with no filespaces.
>
>
> tsm: TSM1500>reg n shawntest do=admin userid=none
> ANR2060I Node SHAWNTEST registered in policy domain ADMIN.
>
> tsm: TSM1500>select node_name from nodes where node_name NOT IN (select
> distinct(node_name) from filespaces)
> ANR2034E SELECT: No match found using this criteria.
> ANS8001I Return code 11.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Recommendations on IBM LTO4 tapes

2016-07-28 Thread Skylar Thompson
IIRC, LTO cartridges are rated in the hundreds of thousands of passes and
tens of thousands of mounts (you need multiple passes to fill a volume,
depending on the LTO generation).

This means in practice you will never hit the "rated" maximum of the
cartridge. Instead, you will hit problems caused by manufacturing defects
or your particular environment (temperature, humidity, shock, etc.), so you
should be looking for library and I/O errors on a particular cartridge to
decide when to retire it.

On Thu, Jul 28, 2016 at 02:50:34PM +, Lamb, Charles P. wrote:
> Hi...
>
> Does anyone have recommendations on how long an IBM LTO4 tape should be used 
> and the maximum number of tape mounts on an IBM LTO4 tape??

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Restore TSM data with no TSM database.

2016-06-14 Thread Skylar Thompson
Not in any useful way, no. You need the database to (among other things)
determine where each file version lives in the storage hierarchy.

On Tue, Jun 14, 2016 at 03:39:24PM +, Plair, Ricky wrote:
> All,
>
> I'm currently in a DR test,  and the following scenario has raised an ugly 
> question.
>
> We have a production Data Domain replicating to a DR DD.
>
> All the data is backed up to the production DD using TSM.
>
> The question is, can  we restore the TSM data at the DR location from the DD 
> if we don't have a TSM database.
>
> In other word,  if we lost the TSM database,  but had the data that was 
> backed up by a TSM server, is there any way to build a new TSM server and 
> retrieve the data.
>
> I appreciate any help.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA

2016-04-22 Thread Skylar Thompson
You can also GRANT PROXY and then use -ASNODE from one of your own nodes,
using your node's password. I think the general node type has to match
(i.e. any UNIX can proxy to any UNIX, but not Windows).
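A sketch of the two steps with made-up node names (server-side grant, then the client-side -asnodename option):

```
grant proxynode target=datanode agent=mynode
dsmc query backup "/data/*" -asnodename=datanode
```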

On Fri, Apr 22, 2016 at 02:20:38PM +, Schneider, Jim wrote:
> Use a server you can access and modify the nodename in the options file, 
> assuming you know the password.
>
> Jim Schneider
> Essendant
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of Lee, 
> Gary
> Sent: Friday, April 22, 2016 9:11 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
>
> Wish I could do that.  This comes from three levels above me in management.
> Trying to buy more storage to sell to departments.
> Don't ask me, I have no clue what they are doing.
>
> I'll look into the q backup on client side, but don't have access to all of 
> them.
>
>
> -Original Message-
> From: ADSM: Dist Stor Manager [mailto:ADSM-L@VM.MARIST.EDU] On Behalf Of 
> Skylar Thompson
> Sent: Friday, April 22, 2016 10:00 AM
> To: ADSM-L@VM.MARIST.EDU
> Subject: Re: [ADSM-L] SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA
>
> If you have access to the clients, you can use QUERY BACKUP and parse the A/I 
> column.
>
> Honestly, though, when we've gotten this query, I've managed to push this 
> back on the customers; it's not TSM's problem what's active or inactive, it's 
> the customers' applications that are actually responsible for it.
> Obviously you need a pretty good relationship with your customers to make 
> that case, but in the end it's caused our customers to think more carefully 
> about workflow in general.
>
> On Fri, Apr 22, 2016 at 01:51:20PM +, Lee, Gary wrote:
> > Just got a request for the amount of active versus inactive data on our tsm 
> > servers.
> >
> > Is there a better way than traversing the backups table and summing?
> > That would be a mighty long query.
> >
> > > We have three servers, and approximately 300 clients, about 200 TB total
> > > data.
> >
> > Thanks for any suggestions.
>
> --
> -- Skylar Thompson (skyl...@u.washington.edu)
> -- Genome Sciences Department, System Administrator
> -- Foege Building S046, (206)-685-7354
> -- University of Washington School of Medicine
>
> **
> Information contained in this e-mail message and in any attachments thereto 
> is confidential. If you are not the intended recipient, please destroy this 
> message, delete any copies held on your systems, notify the sender 
> immediately, and refrain from using or disclosing all or any part of its 
> content to any other person.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA

2016-04-22 Thread Skylar Thompson
Thanks for reminding me about this, I had completely forgotten. At the
time, the customers wanted per-directory stats so it wasn't an option for
us.

On Fri, Apr 22, 2016 at 10:11:27AM -0400, Zoltan Forray wrote:
> Per this document, IBM's suggestion/recommendation is to perform EXPORT
> NODE  FILED=BACKUPActive PREVIEW=YES
>
> http://www-01.ibm.com/support/docview.wss?uid=swg21267260
>
> On Fri, Apr 22, 2016 at 10:00 AM, Skylar Thompson <skyl...@u.washington.edu>
> wrote:
>
> > If you have access to the clients, you can use QUERY BACKUP and parse the
> > A/I column.
> >
> > Honestly, though, when we've gotten this query, I've managed to push this
> > back on the customers; it's not TSM's problem what's active or inactive,
> > it's the customers' applications that are actually responsible for it.
> > Obviously you need a pretty good relationship with your customers to make
> > that case, but in the end it's caused our customers to think more carefully
> > about workflow in general.
> >
> > On Fri, Apr 22, 2016 at 01:51:20PM +, Lee, Gary wrote:
> > > Just got a request for the amount of active versus inactive data on our
> > tsm servers.
> > >
> > > Is there a better way than traversing the backups table and summing?
> > > That would be a mighty long query.
> > >
> > > We have three servers, and approximately 300 clients, about 200 TB total
> > > data.
> > >
> > > Thanks for any suggestions.
> >
> > --
> > -- Skylar Thompson (skyl...@u.washington.edu)
> > -- Genome Sciences Department, System Administrator
> > -- Foege Building S046, (206)-685-7354
> > -- University of Washington School of Medicine
> >
>
>
>
> --
> *Zoltan Forray*
> TSM Software & Hardware Administrator
> Xymon Monitor Administrator
> VMware Administrator (in training)
> Virginia Commonwealth University
> UCC/Office of Technology Services
> www.ucc.vcu.edu
> zfor...@vcu.edu - 804-828-4807

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: SQL QUERY FOR AMOUNT OF ACTIVE VS INACTIVE DATA

2016-04-22 Thread Skylar Thompson
If you have access to the clients, you can use QUERY BACKUP and parse the
A/I column.

Honestly, though, when we've gotten this query, I've managed to push this
back on the customers; it's not TSM's problem what's active or inactive,
it's the customers' applications that are actually responsible for it.
Obviously you need a pretty good relationship with your customers to make
that case, but in the end it's caused our customers to think more carefully
about workflow in general.
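A rough sketch of the parsing step in Python. The sample lines only approximate "dsmc query backup" output, whose exact layout varies by client version and locale, so the regular expression is an assumption to adapt:

```python
import re

# Lines shaped like 'dsmc query backup' output (approximate; format varies).
SAMPLE = """\
         4,096  B  12/01/2015 02:00:00    DEFAULT    A  /home/a
        10,240  B  11/30/2015 02:00:00    DEFAULT    I  /home/a
         2,048  B  12/01/2015 02:00:00    DEFAULT    A  /home/b
"""

# size in bytes ... A/I column ... path
LINE = re.compile(r"^\s*([\d,]+)\s+B\s+.*\s([AI])\s+(\S+)$")

def tally(report):
    """Sum bytes per A(ctive)/I(nactive) state from a q backup style report."""
    totals = {"A": 0, "I": 0}
    for line in report.splitlines():
        m = LINE.match(line)
        if m:
            totals[m.group(2)] += int(m.group(1).replace(",", ""))
    return totals

totals = tally(SAMPLE)  # {'A': 6144, 'I': 10240} for the sample above
```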

On Fri, Apr 22, 2016 at 01:51:20PM +, Lee, Gary wrote:
> Just got a request for the amount of active versus inactive data on our tsm 
> servers.
>
> Is there a better way than traversing the backups table and summing?
> That would be a mighty long query.
>
> We have three servers, and approximately 300 clients, about 200 TB total data.
>
> Thanks for any suggestions.

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: Managed servers out of sync

2016-04-19 Thread Skylar Thompson
The problem is that a domain that has any nodes in it and is out-of-sync
cannot be fixed; I believe that operation is functionally like a
delete. It's really no big deal to do the temp domain, though:

copy domain dom1 temp-dom1
update node * wheredom=dom1 dom=temp-dom1
(sync up)
update node * wheredom=temp-dom1 dom=dom1
delete domain temp-dom1

The copy domain operation even copies the active policyset and default
management classes, so there's no risk of having the wrong retention policies 
applied.

On Tue, Apr 19, 2016 at 01:39:54PM +, Kamp, Bruce (Ext) wrote:
> I have 4 TSM servers running on AIX the library manager/configuration manger 
> is now 7.1.4.100 (upgraded from 7.1.0) the other 3 are 7.1.0.
> About a month ago the server to server communications stopped working because 
> of authentication failure.  In working with IBM it was decided that I need to 
> upgrade all my servers to a higher version of TSM.  With all the changes 
> going on at the moment it will take a while for me to upgrade the rest of the 
> servers.
> I have figured out a temporary work around to get the communications working 
> until I can upgrade.  What I found out when I "fixed" the first server is 
> that domains have become out of synch.
>
> ANR3350W Locally defined domain FS_PROD_DOMAIN_04 contains
> at least one node and cannot be replaced with a
> definition from the configuration manager. (SESSION: 4)
>
> When I asked IBM how to figure out which nodes are causing this problem I was 
> told I had to move all nodes to a temp domain delete the domain run notify 
> subscribers & than move the nodes back into the domain.
>
> What I am wondering is if anyone knows how I can identify what nodes are 
> causing this so I only have to move them out & back?
>
>
> Thanks,
> Bruce Kamp
> GIS Backup & Recovery
> (817) 568-7331
> e-mail: mailto:bruce.k...@novartis.com
> Out of Office:

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: EoS for TSM 6.3

2016-04-18 Thread Skylar Thompson
Given that the TSM v6.4 "release" shipped the v6.3 server[1], does this
mean that the v6.3 server will continue to be supported until v6.4 is
no longer supported?

[1] http://www-01.ibm.com/support/docview.wss?uid=swg21243309

On Mon, Apr 18, 2016 at 04:56:16PM +0100, Schofield, Neil (Storage & 
Middleware, Backup & Restore) wrote:
> In case anyone missed it, IBM last week announced the End-of-Support date for 
> TSM 6.3 would be April 30th 2017:
> http://www.ibm.com/common/ssi/rep_ca/2/897/ENUS916-072/index.html

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: SQL statement

2016-04-12 Thread Skylar Thompson
If 
> >> you are not the addressee, you are notified that no part of the e-mail or 
> >> any attachment may be disclosed, copied or distributed, and that any other 
> >> action related to this e-mail or attachment is strictly prohibited, and 
> >> may be unlawful. If you have received this e-mail by error, please notify 
> >> the sender immediately by return e-mail, and delete this message.
> >>
> >> Koninklijke Luchtvaart Maatschappij NV (KLM), its subsidiaries and/or its 
> >> employees shall not be liable for the incorrect or incomplete transmission 
> >> of this e-mail or any attachments, nor responsible for any delay in 
> >> receipt.
> >> Koninklijke Luchtvaart Maatschappij N.V. (also known as KLM Royal Dutch 
> >> Airlines) is registered in Amstelveen, The Netherlands, with registered 
> >> number 33014286
> >> 

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: TSM RHEL7 Linux server and EMC Isilon

2016-03-28 Thread Skylar Thompson
Never done this myself (only used DAS for the disk/file pools), but FILE
would be my choice too. NFS is decent at sequential I/O (which FILE
guarantees) but is atrocious at random I/O.
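For reference, a FILE device class over an NFS mount looks roughly like this
as a sketch; the directory path, volume size, and mount limit are placeholder
values to tune for your Isilon mount, not recommendations:

```
define devclass isilon_file devtype=file directory=/mnt/isilon/tsm maxcapacity=50g mountlimit=32
define stgpool isilon_pool isilon_file maxscratch=500
```

The sequential FILE volumes keep the NFS traffic in the streaming-I/O pattern
NFS handles well, which a random-access DISK pool would not.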

On Mon, Mar 28, 2016 at 11:02:21AM -0400, Zoltan Forray wrote:
> We are working on a project to beef up offsite backups using an EMC Isilon
> box attached to a RHEL TSM server.
>
> Anybody doing this kind of configuration?  I have concerns since it will be
> connecting >300TB via NFS mount to use for the TSM storage. I am assuming
> it will be best to define the stgpool as a FILE format?

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine


Re: WAN performance

2016-02-18 Thread Skylar Thompson
I thought TCPBUFFSIZE could only go up to 64? It could be that setting it
to 512 actually sets it to the default of 16.

On Thu, Feb 18, 2016 at 02:03:26PM -0500, Tom Alverson wrote:
> I am seeing very poor WAN performance on all of my (wan based) TSM
> backups.  Due to the latency (40 msec typical) I normally only get about
> 20% of the available bandwidth used by a TSM backup.  With EMC Networker I
> get over 90% utilization.  I have already set all of these recommended
> options:
>
> RESOURCEUTILIZATION 2
>
> TXNBYTELIMIT 2097152
>
> TCPNODELAY YES
>
> TCPBUFFSIZE 512
>
> TCPWINDOWSIZE 2048
>
> LARGECOMMBUFFERS YES
>
>
> Does anyone know of anything else that could help performance?  Has anyone
> used a Riverbed accelerator for TSM backups?
>
>
> Tom

--
-- Skylar Thompson (skyl...@u.washington.edu)
-- Genome Sciences Department, System Administrator
-- Foege Building S046, (206)-685-7354
-- University of Washington School of Medicine

