[gt-user] Problems with PBS SEG on Torque 5.1.
Hello,

We've been running Globus at our site so that internal users can submit jobs to our Moab/Torque cluster for workflow work. Over the holidays we upgraded Torque from version 4.1.x to version 5.1.2. Since the upgrade, any attempt to get the status of a submitted job always returns "PENDING".

We're running on RHEL6, using the Globus.org yum repo, and everything related to PBS is fully updated.

% rpm -qa | grep pbs
globus-gram-job-manager-pbs-2.4-2.el6+gt6.x86_64
globus-gram-job-manager-pbs-setup-seg-2.4-2.el6+gt6.x86_64

I can't find many details online on troubleshooting Torque issues with Globus. I confirmed that the job runs, the job completes, the output files are there, etc., but the events for job start and job end appear to be getting lost. No new files or entries have been created in /var/lib/globus/globus-seg-pbs since we updated Torque. I've restarted all the Globus services with no effect.

I know that the log format changed between version 4 and version 5 of Torque. I was wondering whether there is an alternate version of the PBS packages that supports the updated log format, or some other configuration option I need to set so that it uses the updated log format and can create the files SEG relies on.

Thanks,
-Brad Viviano

===
Brad Viviano
High Performance Computing & Scientific Visualization
Lockheed Martin IS&GS - Civil, Supporting the EPA
Research Triangle Park, NC
919-541-2696
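One quick way to check the symptom described above (nothing new appearing under /var/lib/globus/globus-seg-pbs) is to list the files in that directory that were modified recently. The following is a minimal sketch, assuming a find that supports -mmin (GNU find does). The function name seg_recent_files and the 60-minute default are illustrative only and are not part of Globus.

```shell
# Hypothetical helper: list files under the SEG log directory that were
# modified within the last N minutes.
#   $1 = directory to check (default: the path mentioned in this thread)
#   $2 = age window in minutes (default: 60, an arbitrary choice)
seg_recent_files() {
    find "${1:-/var/lib/globus/globus-seg-pbs}" -type f -mmin "-${2:-60}" -print
}
```

Run with no arguments shortly after submitting a test job: if it prints nothing, no file in that directory has been written in the last hour, which matches the behaviour reported here.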
Re: [gt-user] Problems with PBS SEG on Torque 5.1.
Joe,

The "poll" solution works fine. We've switched to poll until a resolution is found for SEG. As I said, there were lots of changes in the format of the Torque files in 5.x and later; we've had to change many of our own internal tools that parse the Torque logs to accommodate those changes.

Thanks,
-Brad

From: Joseph Bester
Sent: Wednesday, January 20, 2016 7:35 AM
To: Viviano, Brad
Cc: gt-user@lists.globus.org
Subject: Re: [gt-user] Problems with PBS SEG on Torque 5.1.

I'm looking into this.

Joe
Re: [gt-user] Problems with PBS SEG on Torque 5.1.
Joe,

I installed the updated packages and ran a couple of simple tests (globus-job-submit / globus-job-status). Everything works on my test setup. I'll get some of our development team to test it with their workflow work and report back if we run into any problems.

Thanks,
-Brad

From: Joseph Bester
Sent: Wednesday, January 20, 2016 2:30 PM
To: Viviano, Brad
Cc: gt-user@lists.globus.org
Subject: Re: [gt-user] Problems with PBS SEG on Torque 5.1.

I uploaded a new set of globus-gram-job-manager-pbs packages (RPM version 2.5-1.el6+gt6) to the unstable repository that I think should fix these issues. If you installed GT via the globus-toolkit-repo-latest package, you can change /etc/yum.repos.d/globus-toolkit-6-unstable-el6.repo to have enabled=1 for the repos, and you should then be able to update to those packages. I'd like to get some feedback on this before moving them to the stable release stream.

Joe
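The repo change described in the message above could be scripted roughly as follows. This is a sketch only: enable_repo_file is a hypothetical helper, it assumes the file contains standard yum "enabled=0" lines, and it turns on every repo section in the file, which may be more than you want. Editing the file by hand is just as valid.

```shell
# Hypothetical helper: switch every "enabled=0" line in a yum .repo file
# to "enabled=1", keeping a backup copy next to it as <file>.bak.
#   $1 = path to the .repo file
enable_repo_file() {
    sed -i.bak 's/^enabled[[:space:]]*=[[:space:]]*0[[:space:]]*$/enabled=1/' "$1"
}

# Intended use, as root (not executed here):
#   enable_repo_file /etc/yum.repos.d/globus-toolkit-6-unstable-el6.repo
#   yum update 'globus-gram-job-manager-pbs*'
```

Once testing is done, restoring the .bak copy returns the host to the stable repositories only.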