Some information posted for Art Eisenhour from Tivoli Advanced Support.

Let me add some background for those not familiar with shared USS file systems, and provide guidance for this issue in the future.
TWS requires two directories in USS file systems: one for binary executable code and one for work areas. It is recommended that these two directories each reside in their own separate file systems. This facilitates maintenance of the code, by allowing an unmount of the current level and a mount of the new, and it facilitates reallocation of the work-area file system should it be at risk of running out of space.

As Jim points out, these file systems should be owned by the z/OS Sysplex member where the TWS End-to-End server is running, which must be the same system where the Controller is running. In short, if the TWS Controller is moved or recovered to another Sysplex member, ownership of the TWS file systems should be transferred to the same image.

File system ownership can be moved automatically by z/OS, through file system automove definitions, in the case where the owning system is shut down or fails. File system ownership can also be moved through the use of automation or operator commands. Automation can be used to move file system ownership as part of a TWS Controller move process, or when a standby Controller becomes active. Automation might trigger off of the Controller active message EQQN013I. Alternatively, a prestart step in the End-to-End server start process could cause the move; the only requirement is that the move occur before the End-to-End server is started.

*Note that the ability to move the TWS started tasks to different Sysplex members comes standard with IBM System Automation for z/OS.
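As a rough sketch of the two-file-system layout described above, the BPXPRMxx MOUNT statements might look something like the following. The data set names, mount points, and version directory are hypothetical; substitute your installation's own names.

```
/* TWS binary code -- mounted read-only, so maintenance can  */
/* unmount the current level and mount the new one           */
MOUNT FILESYSTEM('OMVS.TWS.BINDIR')
      TYPE(ZFS) MODE(READ)
      MOUNTPOINT('/usr/lpp/TWS/V8R3M0')
      AUTOMOVE

/* TWS work directory -- read/write; should be owned by the  */
/* Sysplex member where the End-to-End server runs           */
MOUNT FILESYSTEM('OMVS.TWS.WRKDIR')
      TYPE(ZFS) MODE(RDWR)
      MOUNTPOINT('/tws/wrkdir')
      AUTOMOVE
```

AUTOMOVE here covers only the case where the owning system leaves the sysplex; moving ownership when the Controller itself is moved still requires automation or operator action, as described above.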
A move of file system ownership could be added to the SA definitions in the Prestart Phase for the End-to-End server.*

For more information on managing USS file systems, reference these publications:

z/OS UNIX System Services Planning, GA22-7800-13
z/OS UNIX System Services Command Reference, SA22-7802-09

"When mounting file systems in the sysplex, you can specify a prioritized system list to indicate where the file system should or should not be moved when ownership of the file system changes due to any of the following:
- A soft shutdown request has been issued.
- Dead system takeover occurs (when a system leaves the sysplex without a prior soft shutdown).
- A PFS terminates on the owning system.
- A request to move ownership of the file system is issued."

Information specific to TWS use of the USS file systems can be found in the publication Tivoli Workload Scheduler for z/OS Installation Guide Version 8.3, SC32-1264-03, in the section "Configuring for end-to-end scheduling in a SYSPLEX environment":

"Having a shared HFS in a sysplex configuration means that all file systems are available to all systems participating in the shared HFS support. With the shared HFS support there is no I/O performance reduction for an HFS mounted read-only (R/O). However, the intersystem communication (XCF) required for shared HFS may affect the response time on read/write (R/W) file systems being shared in a sysplex. For example, assume that a user on system SYS1 issued a read request to a file system owned R/W on system SYS2. Using shared HFS support, the read request message is sent via an XCF messaging function. After SYS2 receives the message, it gathers the requested data from the file and returns the data using the same request message. In many cases, when accessing data on a system which owns a file system, the file I/O time is only the path length to the buffer manager to retrieve the data from the cache.
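To illustrate the kind of operator action that automation (triggered by EQQN013I, or run as a prestart step) might issue, ownership can be displayed and moved with standard z/OS console commands along these lines. The file system name and system name below are hypothetical:

```
/* List mounted file systems and their current owners        */
D OMVS,F

/* Move ownership of the TWS work file system to the system  */
/* where the Controller and End-to-End server now run        */
SETOMVS FILESYS,FILESYSTEM='OMVS.TWS.WRKDIR',SYSNAME=SYS1
```

The SETOMVS FILESYS command takes effect sysplex-wide; the key point is simply that it completes before the End-to-End server is started.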
On the contrary, file I/O to a shared HFS from a client which does not own the mount requires additional path length to be considered, plus the time involved in the XCF messaging function. Increased XCF message traffic is a factor which can contribute to performance degradation. For this reason, it is recommended for system files to be owned by the system where the end-to-end server runs.

On z/OS systems, the shared zFS capability is available: all file systems that are mounted by a system participating in shared zFS are available to all participating systems. When allocating the work directory in a shared zFS, you can decide to define it in a file system mounted under the system-specific zFS or in a file system mounted under the sysplex root. A system-specific file system becomes unreachable if the system is not active. To make good use of the takeover process, define the work directory in a file system mounted under the sysplex root and defined as automove."

Susan

On Tue, Aug 12, 2008 at 2:06 PM, Jim Marshall <[EMAIL PROTECTED]> wrote:

> Working with IBM Level 2 or maybe 3, we now understand what is causing the
> excessive CPU time being used by the Distributed component of Tivoli
> Workload Scheduler. I will review the scenario:
>
> We are running IBM 2096-O02 (36 MSU) and 2096-T03 (95 MSU) machines in a
> Parallel Sysplex, where TWS runs on the "O02" system (the smaller of the
> two). TWS is scheduling work in the Parallel Sysplex, and there is also a
> distributed component scheduling for 3-4 Windows Servers. Historically it
> is interesting, for TWS had its roots in an IBM product called OPC
> (Operations Planning and Control), which did z/OS and distributed
> scheduling using "Trackers". It worked very well using little CPU time.
> OPC morphed into Tivoli Workload Scheduler for z/OS, and IBM bought a
> company called Maestro which did distributed scheduling. The two products
> were merged and Trackers went away.
> It took IBM a few years to fully integrate the two products. This brings
> us down to the present and the performance issues encountered.
>
> TWS for z/OS runs separately from another started task for distributed
> TWS, called TWSE2E. TWSE2E was seen taking about 3 MSUs' worth of the O02,
> when the system used to run around 28-29 MSUs max in a month. IBM
> researched the issue and came forth with an explanation which is not
> highlighted in any of the Tivoli manuals, as far as we can read. TWSE2E
> executes its programs in the O02's USS system and has files defined in a
> zFS file system. If that zFS file system is not owned by the LPAR where
> TWS is running, all the I/O must go through XCF in the Parallel Sysplex,
> generating the extraordinary amounts of CPU time seen as being used by
> TWSE2E in that LPAR. The recommendation now is to always have the zFS
> file system mounted to the LPAR where TWS is operating (otherwise TWSE2E
> will eat your lunch, dinner, etc.). When we switched TWS's zFS file
> system back to the TWS LPAR, the CPU consumption dropped to almost
> nothing.
>
> I can understand the recommendation, and it now raises some
> considerations to ponder:
>
> 1. When a TWS LPAR is taken down, ownership of its zFS file system is
> automagically transferred to some other LPAR, and it is not your choice
> which one (another interesting discussion could follow this line). So
> when the TWS LPAR is IPL'ed, operationally one must ensure the proper
> commands are issued to bring back ownership of TWS's zFS file system.
>
> 2. One can implement all of #1 in "automation" if one is running some
> sort of automation package; a good case for getting one.
>
> 3. Keep in mind this is not a Parallel Sysplex problem but a zFS
> challenge.
>
> 4.
> I just have to wonder: if all this is caused by I/O for TWSE2E having to
> go through XCF to get to the other LPAR where the zFS is owned, then why
> not the WAIT associated with I/O, versus the heavy, heavy CPU load caused
> by this I/O (3-4 Windows Servers which get about 30-40 jobs per day)?
>
> Note: I just have to believe there is more to the story, and it may not
> be a TWS problem, but maybe TWS exploiting something in USS and zFS which
> is a bad design.
>
> POSTSCRIPT: Things are back to using an acceptable amount of CPU, and
> everyone is older and wiser.
>
> Jim
>
> P.S. I wonder how many other z/OS USS implementations are using excessive
> CPU because of the ownership of some zFS file system. I will be on the
> watch for something like it in the future.
>
> ----------------------------------------------------------------------
> For IBM-MAIN subscribe / signoff / archive access instructions,
> send email to [EMAIL PROTECTED] with the message: GET IBM-MAIN INFO
> Search the archives at http://bama.ua.edu/archives/ibm-main.html

