Re: urgent! server is down!

2003-03-17 Thread Jozef Zatko
Hello guys,
one of our customers wants to back up Oracle databases so that, in addition
to the normal daily backup with 14-day retention, a separate full backup of
the Oracle databases is created once a week and kept for 1 year.
My question is how to do this as efficiently as possible (some of the
databases are quite large - 400 GB).

Is there any other solution than to use two different node names?
And if I use two node names for Oracle backup, how do I configure TDP? Do I
need two RMAN backup repositories - one for each TSM node - or can I use only
one?

Thank you in advance

Ing. Jozef Zatko
Login a.s.
Dlha 2, Stupava
tel.: (421) (2) 60252618


Re: urgent! server is down!

2003-03-13 Thread Gagan Singh Rana
Dear *ites

Try auditing the database; it might solve your problem...

rgds
Gagan Singh Rana

What is now proved was once only imagined.


the Business Enterprise Solutions Team

QuantM Systems Pvt. Ltd.
79 Amrit Nagar, NDSE Part I
New Delhi - 110003

Voice : 91-11-4691575/4602217
Fax: 91-11-4691188
Hand Phone : 91-9868091938








urgent! server is down!

2003-03-12 Thread Michelle Wiedeman
hi all,

first things first: AIX 4.3, TSM 4.1.
After a reboot yesterday TSM doesn't start. A colleague of mine was at it
till late last night.
Now I have to pick it up. My guess is the database is corrupt, but the
messages also come with other errors.
Could anyone help me out? Also, I've never had to restore a db before. How
do I go about it?
Below are the errors.

thanks heaps!
michelle

ANR0900I Processing options file dsmserv.opt.
ANR000W Unable to open default locale message catalog, /usr/lib/nls/msg/C/.
ANR0990I Server restart-recovery in progress.
ANRD lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has changed; old capacity 983040 - new capacity 999424.
ANRD lvminit.c(1628): Unable to add disk /dev/rtsmvglv11 to disk table.
ANRD lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has changed; old capacity 983040 - new capacity 999424.
ANRD lvminit.c(1628): Unable to add disk /dev/rtsmvglv11 to disk table.
ANR0259E Unable to read complete restart/checkpoint information from any database or recovery log volume.


Re: urgent! server is down!

2003-03-12 Thread Richard Sims
After a reboot yesterday tsm doesnt start. ...
...
ANR0900I Processing options file dsmserv.opt.
ANR000W Unable to open default locale message catalog, /usr/lib/nls/msg/C/.
ANR0990I Server restart-recovery in progress.
ANRD lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has
changed; old capacity 983040 - new capacity 999424.
...
 anyway I've never had to restore a db before. How do I go at it?

I would take a deep breath and stand back and think about that situation
first...  There's no good reason for a server to be running fine and
then fail to restart.  Did someone change something??  It's known as a
time bomb, as when someone made a change perhaps weeks ago while the
server was running, external to the server, and then when the server
goes to restart and thus re-read config files and re-open files
according to their names (rather than prevailing inode usage), it
can't get anywhere.  The locale message really makes me wonder about
that: looks like someone changed the startup environment and its LANG
variable from en_US to C.  Or wacky/changed start-up scripts are being
used.  Consider the directory where you're sitting when you start the
server, and the viability of the start-up script.  Examine the timestamps
and contents of your dsmserv.opt and /var/adsmserv/ files to see if
someone has monkeyed with things.  Review your site system change log
to see if perhaps someone on the AIX side of things made an environmental
change that could have affected your server.

Remember - restarting the server correctly is more important than
restarting it quickly.  I would not even think of approaching a server
db restore until you've gotten to the bottom of what happened to the
structure of your environment, as such a restoral would only try to
restore into the possibly faulty environment.

  Richard Sims, BU


Re: urgent! server is down!

2003-03-12 Thread Cook, Dwight E
What was on /dev/rtsmvglv11?
By the errors you are seeing, I'd guess either a database or a log volume.
Does errpt show problems on the physical volume(s) that hold tsmvglv11?


Dwight





Re: urgent! server is down!

2003-03-12 Thread Richard Sims

If /dev/rtsmvglv11 indicates, as I think it does, that it is
Raw TSM Volume Group, Logical Volume 11, then the disk may have been
written over by someone else.  A conspicuous problem with using raw
logical volumes is that the absence of a file system can lead novices
to believe that nothing is there and, seeing lots of empty space, they
go and use it.  And if you don't identify whodunit, they could undo any
remedial efforts you attempt in re-doing whatever they did.

  Richard Sims, BU


Re: urgent! server is down!

2003-03-12 Thread Zlatko Krastev/ACIT
The explanation is pretty simple:
1. TSM is installed using a raw logical volume (LV).
2. I would guess /dev/rtsmvglv11 is a DB or log volume (if it were a disk
pool volume, TSM ought to start and leave the volume offline).
3. Someone has enlarged the logical volume with an AIX command such as
extendlv - look at the difference between the new and old capacity
(999424 - 983040 = 16384).
4. Richard might be correct that this happened long before the restart, and
you may not be able to identify who did it and when.

How to cure this:
A. (supported) Restore from the last DB backup. Look in the volume history
file - the default is /usr/tivoli/tsm/server/bin/volhist.out; a non-default
location is specified by the VOLUMEHistory option in dsmserv.opt. Then use
dsmserv restore db (described in the Administrator's Reference, Appendix A).

B. (unsupported and with no guaranteed success)
- create a new logical volume with the AIX command mklv (work out its size
carefully - it ought to be the same as before the change; use the output of
lslv tsmvglv11 as guidance)
- copy the contents with the AIX command
dd if=/dev/rtsmvglv11 of=/dev/r<new LV name>
- rename the old LV (I prefer not to delete it) with the AIX command
chlv -n <temporary name> tsmvglv11
- rename the new LV with chlv -n tsmvglv11 <new LV name>
- try to start the TSM server
- on success report to the list ASAP :-)

If you can afford some lost backups, option A is the correct one. If you
have some time, try option B first and on failure continue with A.

Zlatko Krastev
IT Consultant








Re: urgent! server is down!

2003-03-12 Thread Dan Foster
Hot Diggety! Richard Sims was rumored to have written:
 After a reboot yesterday tsm doesnt start. ...
 ...
 ANR0900I Processing options file dsmserv.opt.
 ANR000W Unable to open default locale message catalog, /usr/lib/nls/msg/C/.
 ANR0990I Server restart-recovery in progress.
 ANRD lvminit.c(1872): The capacity of disk '/dev/rtsmvglv11' has
 changed; old capacity 983040 - new capacity 999424.
 ...
 I would take a deep breath and stand back and think about that situation
 first...  There's no good reason for a server to be running fine and
[...]

I agree with what Richard had to say. Taking a deep breath is always step
#1 for handling a crisis without making it worse.

999424 - 983040 = 16384, which is exactly 16 MB and sounds suspiciously
like the PP size. 'rtsm...' sounds like a raw LV rather than a filesystem.

Perhaps someone with root access had done this at some point earlier:

# extendlv tsmvglv11 1

[or had done the equivalent in SMIT.]

(DO NOT EXECUTE THE ABOVE COMMAND! I am only theorizing what may have
happened)

As for en_US vs C, do this:

# grep LANG /etc/environment

If it says LANG=C then try:

1. Changing it to LANG=en_US in /etc/environment
2. At the root prompt: # export LANG=en_US
3. Try starting up TSM now
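
For anyone who would rather script the LANG check, here is a minimal sketch
in Python. It assumes /etc/environment holds simple NAME=value lines; the
helper name read_lang is made up for illustration, and the code only reads
text and changes nothing on the system.

```python
def read_lang(environment_text):
    """Return the value of LANG from /etc/environment-style text, or None."""
    for line in environment_text.splitlines():
        line = line.strip()
        # Skip comments and anything that is not a NAME=value pair.
        if line.startswith("#") or "=" not in line:
            continue
        name, value = line.split("=", 1)
        if name.strip() == "LANG":
            return value.strip()
    return None


# Illustrative content only, not taken from the server in this thread.
sample = "PATH=/usr/bin:/etc\n# locale settings\nLANG=C\n"
lang = read_lang(sample)
if lang == "C":
    print("LANG is C; consider en_US as suggested above")
```

On a real box the text would come from open("/etc/environment").read().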

And you will probably want to ask your operations staff if anyone had
increased the LV's allocation, perhaps by one physical partition with
extendlv or similar. If someone had done it, I'd have made them put Humpty
Dumpty back together as a great learning experience ;) Tell people to *NOT*
mess around with the TSM server if they do not know what they're doing.

I did a quick test with TSM 5.1 by creating a small 16 MB DB logical
volume (1 PP), started up server OK. Then I did 'extendlv tsmdblv 1',
halted server, and tried to start it up again. I got the exact same
errors you got.

I suspect you may have to remove that LV, recreate it with the expected
size that TSM wants, then do a DB restore from your most recent full db
backup tape.

But before you do that, you'll want to save a copy of your current device
config and volume history file if you have these, as well as your
dsmserv.opt file. Then look in the TSM 5.1 Server for AIX Administrator's
guide at:

http://publibfp.boulder.ibm.com/epubs/pdf/c3207680.pdf

(This is assuming you use TSM 5.1 for AIX; if you use another version,
you'll want to consult that guide instead, but the steps will probably
be similar or still exactly the same.)

DB restore is covered in Chapter 22. 'Restoring a Database to its Most
Current State' at bottom of page 524 is probably your easiest option since
it sounds like you have everything else intact -- volume history info,
logvols, stgpool vols, etc.

Then you'll have to delete (with 'rmlv -f tsmvglv11') the offending LV,
and recreate it (with 'mklv -y tsmvglv11 <vg> <number of PPs>'). Then...

Find out which tape has the most recent full DB backup, then do:

# cd /usr/tivoli/tsm/server/bin
# ./dsmserv restore db devclass=whatever vol=<tape volser>

If that command worked (it's a preview, basically), then do:

# ./dsmserv restore db devclass=whatever vol=<tape volser> commit=yes

...which will make the restore actually happen, for real.

The actual restore operation is no big deal if you have a good and recent
db backup tape, and know which tape it is. I did this as part of testing
recently, and it worked right off the bat with no problems at all.

If you don't know which tape volser has the latest full db backup, then
you could look into your volume history file. For example, with my setup:

backup1:/usr/tivoli/tsm/server/bin# grep BACKUPFULL volhist.cfg
 2003/02/21 14:41:24  BACKUPFULL  5  0  1  3584_DEVCLASS1 ROC010
 2003/03/01 21:06:51  BACKUPFULL  6  0  1  3584_DEVCLASS1 ROC012

(The file might be called 'volhistory.cfg'; I had explicitly defined mine
to be 'volhist.cfg' at server installation time.)

I have two full db backup tapes... one was done on 2/21 and is version 5.
The more recent one was done on 3/1 and is version 6. So I'd restore ROC012,
for example.
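
Picking the newest full backup out of the volume history can also be
sketched in a few lines of Python. This assumes each record is a single
whitespace-separated line laid out like the grep output above (date, time,
type, three numeric fields, device class, volser). The function name
latest_full_backup is hypothetical, and a real volume history file may be
laid out differently, so treat this as an illustration rather than a tool.

```python
from datetime import datetime


def latest_full_backup(lines):
    """Return (timestamp, devclass, volser) of the newest BACKUPFULL record."""
    records = []
    for line in lines:
        fields = line.split()
        # Expect: date time type n n n devclass volser (assumed layout).
        if len(fields) < 8 or fields[2] != "BACKUPFULL":
            continue
        stamp = datetime.strptime(fields[0] + " " + fields[1],
                                  "%Y/%m/%d %H:%M:%S")
        records.append((stamp, fields[6], fields[7]))
    return max(records) if records else None


# The two records shown in the grep output above.
sample = [
    " 2003/02/21 14:41:24  BACKUPFULL  5  0  1  3584_DEVCLASS1 ROC010",
    " 2003/03/01 21:06:51  BACKUPFULL  6  0  1  3584_DEVCLASS1 ROC012",
]
newest = latest_full_backup(sample)
print(newest[1], newest[2])
```

With the two sample records this selects the 3/1 backup on ROC012.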

I should warn you that any backups done after the date/time of the most
recent DB backup will effectively be lost, so be sure you really do want to
restore the DB before committing to it. Doesn't sound like you have too
much of a choice in this particular case.

Note for the discerning ADSM-L reader: my production server has daily db backups! The 
above was from a test box.

-Dan


Re: urgent! server is down!

2003-03-12 Thread Michelle Wiedeman
hi all,
It took a while to answer you all, since the whole company seems to be
imploding at the moment.

Of course no one has done anything on the server :| /var has been
enlarged, but that is in rootvg and not in the vg that holds the
tsm volumes.

Well, the volume in question turns out to be a db volume; it resides in a
separate volume group (tsmvg) and has a mirror, /dev/rtsmvglv12.

Since tsm says the volume has changed in size, I guess it is useless to try
using the mirror.
The suggestion that the size difference the server reports in the error is
exactly 16 MB doesn't fit: the PP size of this vg is 64 MB.

I'm going to try to restore the lv and the db and see how it goes.

I'll let you know!!

thanks a lot everyone! :o*
michelle




Re: urgent! server is down!

2003-03-12 Thread PINNI, BALANAND (SBCSI)
It's really confusing, since an LV cannot change size without PPs being
added to it.

If you want to take the risk, why not run the synclvodm command to sync your
ODM with the LV?


Re: urgent! server is down!

2003-03-12 Thread Cook, Dwight E
YOU MIGHT be able to take /dev/rtsmvglv11 out of your dsmserv.dsk file
and try starting TSM.
On seeing your ...11 first, and seeing it ~messed up~, TSM is more than
likely going to simply state
"I'm broke, I'm going down..."
and let you deal with how you want to fix it...

Try removing the seemingly broken volume from the dsmserv.dsk file and try to
start TSM...

Dwight




Re: urgent! server is down!

2003-03-12 Thread Zlatko Krastev/ACIT
Michelle,

it *is* one PP! Actually TSM measures the DB in pages of 4 KB. Thus 16384
pages is exactly 64 MB == 1 PP.
If you have a TSM-mirrored volume, just rename the LV and start TSM. It will
not find the LV, will use the mirror, and that is it.
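
The unit conversion is easy to check. Below is a minimal sketch in Python,
assuming (as stated above) that the capacities in the error message are
counted in 4 KB pages and that the PP size of tsmvg is 64 MB, as Michelle
reported. The constant and function names are only illustrative.

```python
PAGE_KB = 4         # TSM database page size in KB (per the explanation above)
PP_MB = 64          # physical partition size of the volume group, in MB

OLD_PAGES = 983040  # "old capacity" from the error message
NEW_PAGES = 999424  # "new capacity" from the error message


def pages_to_mb(pages):
    """Convert a count of 4 KB pages to megabytes."""
    return pages * PAGE_KB // 1024


growth_mb = pages_to_mb(NEW_PAGES - OLD_PAGES)
old_pps = pages_to_mb(OLD_PAGES) // PP_MB
new_pps = pages_to_mb(NEW_PAGES) // PP_MB

print(f"growth: {growth_mb} MB = {growth_mb // PP_MB} PP")
print(f"old size: {old_pps} PPs, new size: {new_pps} PPs")
```

Under those assumptions the volume grew by exactly one 64 MB partition, from
60 to 61 PPs, which is consistent with someone having extended the LV by 1 PP.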

Zlatko Krastev
IT Consultant








Re: urgent! server is down!

2003-03-12 Thread Michelle Wiedeman
of course!!!
I was thinking in aix terms!
As you see, I've been thinking in difficult ways! u tend to forget the easy
stuff and skip to the difficult stuff after a while.
anyway it worked! (the renaming of the lv and starting tsm with the mirror)
:D:D:D

everyone thanks a whole big lot
a big kiss for u all!
michelle




Re: urgent! server is down!

2003-03-12 Thread Henk ten Have
On 12-Mar-03 Michelle Wiedeman wrote:
 a big kiss for u all!

  Hmm... I'm too late now with good advice, I guess ;-)

  Cheers,
  Henk ten Have