Thanks for these updates. We were looking to upgrade from v6.3.4 to v6.3.5 shortly, but will hold off for now. Please keep us posted with your progress.
Steve DeGroat
Sr Solution Architect for Storage Design Services and Quality Assurance
Yale University
203.436.4540
"If you build it, they will come."

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Rhodes, Richard L.
Sent: Monday, February 16, 2015 8:01 AM
To: [email protected]
Subject: Re: [ADSM-L] FW: v6.3.5 hung db2??

Well, I thought we had this resolved. Yesterday (Sunday) we had another crash of this TSM instance. I've opened another PMR. We had 2 more instances scheduled to be upgraded to v6.3.5 today that are now postponed indefinitely.

Rick

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Mitchell, Ruth Slovik
Sent: Friday, February 13, 2015 3:13 PM
To: [email protected]
Subject: Re: FW: v6.3.5 hung db2??

Rick,

Thank you for letting us know about this. It would be interesting to know if related messages were captured in the db2diag.log when this started to manifest itself.

Best,
Ruth
U of I, Urbana, IL

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Rhodes, Richard L.
Sent: Friday, February 13, 2015 1:38 PM
To: [email protected]
Subject: Re: [ADSM-L] FW: v6.3.5 hung db2??

Working with some good support folks!

Looks like we hit this:
https://urldefense.proofpoint.com/v2/url?u=http-3A__www-2D01.ibm.com_support_docview.wss-3Fcrawler-3D1-26uid-3Dswg1IT06126&d=AwIFAg&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=cU4lgHg-mogg3FJ7Okdd3I2i9Cl4aPnV7nm0FbEjOWY&m=jaMir-5Mj0MJ5eKdK-8UTNsNe9iNYDzZQiuTa22XCgQ&s=kcZ7t5IEvD_0y2FaFO7NGVEkbZ2fWTs2cziJ2lcxipk&e=

v6.3.5 and v7.1.0 introduced a bug in the rc.dsmserv startup script. The result is that db2 was running on limited memory - 32MB in our case. This was the default value in /etc/security/limits. Lvl 2 had me change the /etc/security/limits default to unlimited memory. Lvl 1 had the above APAR, and I fixed the rc.dsmserv script per the instructions.
So it looks like our problems were caused by very low db2 memory. I believe it was restricted to 32 MB!

Rick

-----Original Message-----
From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Rainer Tammer
Sent: Friday, February 13, 2015 11:53 AM
To: [email protected]
Subject: Re: FW: v6.3.5 hung db2??

Hello,
please keep us posted. I will have to go from 6.3.4-300 to a higher version because of the NDMP dump > 2TB overwrite problem...

Bye
Rainer

On 13.02.2015 17:05, Rhodes, Richard L. wrote:
> Yea. I opened a Sev 1.
>
> Thanks!
>
> Rick
>
> -----Original Message-----
> From: ADSM: Dist Stor Manager [mailto:[email protected]] On Behalf Of Andrew Raibeck
> Sent: Friday, February 13, 2015 10:57 AM
> To: [email protected]
> Subject: Re: FW: v6.3.5 hung db2??
>
> Hi Rick,
>
> Off-hand I am not sure what the problem is, I think it would be a good idea to open a PMR if you have not already done so.
>
> Best regards,
>
> - Andy
>
> ____________________________________________________________________________
>
> Andrew Raibeck | Tivoli Storage Manager Level 3 Technical Lead | [email protected]
>
> IBM Tivoli Storage Manager links:
> Product support:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.ibm.com_support_entry_portal_Overview_Software_Tivoli_Tivol&d=AwIFAg&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=cU4lgHg-mogg3FJ7Okdd3I2i9Cl4aPnV7nm0FbEjOWY&m=jaMir-5Mj0MJ5eKdK-8UTNsNe9iNYDzZQiuTa22XCgQ&s=0FHAerXtIarScvH_uCSwm1_6fcvnwOZkJn0aJVxX8lI&e=
> i_Storage_Manager
>
> Online documentation:
> https://urldefense.proofpoint.com/v2/url?u=http-3A__www.ibm.com_support_knowledgecenter_SSGSG7_welcome&d=AwIFAg&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=cU4lgHg-mogg3FJ7Okdd3I2i9Cl4aPnV7nm0FbEjOWY&m=jaMir-5Mj0MJ5eKdK-8UTNsNe9iNYDzZQiuTa22XCgQ&s=z6-KhygfJ8cQDURcjUIT7KQ90l7u4VzmOw8W522aB7U&e=
> Product Wiki:
> https://urldefense.proofpoint.com/v2/url?u=https-3A__www.ibm.com_developerworks_community_wikis_home_wiki_Tivoli-2520&d=AwIFAg&c=-dg2m7zWuuDZ0MUcV7Sdqw&r=cU4lgHg-mogg3FJ7Okdd3I2i9Cl4aPnV7nm0FbEjOWY&m=jaMir-5Mj0MJ5eKdK-8UTNsNe9iNYDzZQiuTa22XCgQ&s=ap1r_YAKONXTJN1XAZO-DhocN1rgS298b04t05J9Ai4&e=
> Storage%20Manager
>
> "ADSM: Dist Stor Manager" <[email protected]> wrote on 2015-02-13 10:41:55:
>
>> From: "Rhodes, Richard L." <[email protected]>
>> To: [email protected]
>> Date: 2015-02-13 10:44
>> Subject: FW: v6.3.5 hung db2??
>> Sent by: "ADSM: Dist Stor Manager" <[email protected]>
>>
>> Now this is really weird.
>>
>> TSM came up after we rebooted. But it threw a bunch of ANR9999 msgs, then QUIT LOGGING. It seems to be running - I got onto a server and did an incr bkup, but nothing is logging in the actlog.
>>
>> 02/13/15 10:00:22 ANR9999D_2891663292 GetDomainByNodeId (pmcache.c:2645) Thread<280>: Node id 626 not found in table Policy.Domain.Members. (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> issued message 9999 from: (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010001ca7c StdPutText (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010001d514 OutDiagToCons (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000090bc outDiagfExt (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001004bf254 GetDomainByNodeId (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001004beeec pmOpenDomain (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001006ac78c BeginVbTxn (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001006a4068 SmNodeSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010053ca64 SmSchedSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001005525d8 HandleNodeSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100549c54 DoNodeSched (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100544900 smExecuteSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100078a7c psSessionThread (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010000c264 StartThread (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D_3095886799 HandleShortCircuitCodes (dbieval.c:1072) Thread<280>: Invalid handle used from tbtbl.c (10153). (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> issued message 9999 from: (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010001ca7c StdPutText (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010001d514 OutDiagToCons (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000090bc outDiagfExt (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000cbb28 HandleShortCircuitCodes (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000cb0a0 DbiEvalSQLOutcomeX (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000a0a18 TblClose (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010019b13c FreeTxnDesc (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010019af14 dbiEndTxn (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001000458bc DoEndFuncCallbacks (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100045d70 tmAbortX (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001004bef60 pmOpenDomain (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001006ac78c BeginVbTxn (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001006a4068 SmNodeSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010053ca64 SmSchedSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x00000001005525d8 HandleNodeSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100549c54 DoNodeSched (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100544900 smExecuteSession (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x0000000100078a7c psSessionThread (SESSION: 125)
>> 02/13/15 10:00:22 ANR9999D Thread<280> 0x000000010000c264 StartThread (SESSION: 125)
>>
>> It then threw this error and STOPPED LOGGING into actlog.
>>
>> 02/13/15 10:03:24 ANR0103E admattrm.c(806): Error 2332 updating row in table "Global.Attributes".
>>
>>
>> From: Rhodes, Richard L.
>> Sent: Friday, February 13, 2015 9:49 AM
>> To: adsm-l mailing list ([email protected])
>> Subject: v6.3.5 hung db2??
>>
>> Two days ago we upgraded one of our TSM instances to v6.3.5 (from v6.3.4).
>> This is our first v6.3.5 instance. It runs on an AIX server.
>>
>> Last night at 19:32 it looks like DB2 went into some kind of a loop.
>> The instance became unresponsive. Dsmadmc cmds hung (didn't error, just hung).
>> Dsmserv process was getting almost no cpu, while db2sysc was running the box
>> at 65-70% but had no disk I/O. I killed dsmserv, but db2 didn't go down.
>> I tried db2stop but it did nothing. Finally rebooted to get everything up.
>> The actlog shows no nasty errors.
>>
>> Just wondering if anyone else has had a runaway db2.
>>
>> Thanks
>>
>> Rick
>>
>> -----------------------------------------
>>
>> The information contained in this message is intended only for the
>> personal and confidential use of the recipient(s) named above. If the
>> reader of this message is not the intended recipient or an agent
>> responsible for delivering it to the intended recipient, you are
>> hereby notified that you have received this document in error and
>> that any review, dissemination, distribution, or copying of this
>> message is strictly prohibited. If you have received this
>> communication in error, please notify us immediately, and delete the
>> original message.
>>
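
For anyone who wants to check their own servers for the low-memory condition described in this thread (db2 started under a 32MB default taken from /etc/security/limits), here is a minimal Python sketch. It is not from IBM support or from the APAR. It assumes the AIX stanza layout ("name:" followed by "key = value" lines, "*" for comments), assumes that data, rss and stack are counted in 512-byte blocks with -1 meaning unlimited, and uses an arbitrary 1024 MiB threshold. The sample text and the function names are hypothetical, and the sketch does not model per-user stanzas inheriting from the default stanza.

```python
# Sketch: flag small memory limits in an AIX-style /etc/security/limits file.
# Assumptions (not taken from the thread): stanza layout, "*" comments,
# 512-byte block units for data/rss/stack, and -1 meaning unlimited.

BLOCK_BYTES = 512
MEMORY_KEYS = ("data", "rss", "stack")


def parse_stanzas(text):
    """Return {stanza_name: {key: int_value}} from limits-style text."""
    stanzas = {}
    current = None
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("*"):  # blank line or comment
            continue
        if line.endswith(":") and "=" not in line:
            # Start of a new stanza such as "default:" or "root:"
            current = stanzas.setdefault(line[:-1].strip(), {})
        elif "=" in line and current is not None:
            key, _, value = line.partition("=")
            try:
                current[key.strip()] = int(value.strip())
            except ValueError:
                pass  # non-numeric values are ignored in this sketch
    return stanzas


def low_memory_limits(stanza, threshold_mib=1024):
    """Return {key: MiB} for memory limits that are finite and below threshold."""
    low = {}
    for key in MEMORY_KEYS:
        blocks = stanza.get(key)
        if blocks is None or blocks == -1:  # unset or unlimited
            continue
        mib = blocks * BLOCK_BYTES / (1024 * 1024)
        if mib < threshold_mib:
            low[key] = mib
    return low


# Hypothetical sample text, not copied from any server in the thread.
SAMPLE = """\
* sample limits file
default:
    fsize = -1
    data = -1
    rss = 65536
    stack = 65536

root:
    rss = -1
"""

if __name__ == "__main__":
    for name, limits in parse_stanzas(SAMPLE).items():
        print(name, low_memory_limits(limits))
```

To run it against a real server, read /etc/security/limits into a string and pass it to parse_stanzas in place of SAMPLE. A value of 65536 blocks works out to 32 MiB, which matches the figure Rick reported.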
