Hey Mike,

Anytime! Would be happy to help more as you guys progress.
Take care and keep rockin' on!

Cheers,
Chris

On May 9, 2012, at 8:42 AM, Iwunze, Michael C (GSFC-4700)[NOAA-JPSS] wrote:

> Thanks Chris, this was good information.
>
> On 5/8/12 1:48 AM, "Mattmann, Chris A" <[email protected]>
> wrote:
>
>> Hey Cam,
>>
>> Thanks, some comments below:
>>
>> On May 7, 2012, at 8:26 PM, Cameron Goodale wrote:
>>
>>> Hey Mike and Sheryl,
>>>
>>> Mike was asking me for some similar advice and I plain ran outta talent on
>>> this topic. From what I can tell Mike would like to run his python scripts
>>> on Resource Manager without the need for setting up Workflow or PGE.
>>>
>>> At the time I hadn't really thought through all the configuration files
>>> needed, but having stewed on it I thought I should reply. Now my current
>>> SnowDS implementation is to have the Workflow Task reference a CAS-PGE
>>> (which contains the execution block for my python program I want to run).
>>> Then the Workflow is merely configured to farm the jobs out to the
>>> Resource Manager.
>>>
>>> Here is a list of questions that I have started to wonder about with Mike's
>>> help, any answers would be appreciated:
>>>
>>> 1. Can Resource Manager + Batchstubs be used without any additional OODT
>>> components?
>>>
>>
>> Yep one way to see this in action is to run the
>> org.apache.oodt.cas.resource.tools.JobSubmitter
>> tool by cd'ing into a resource manager deployment (let's assume
>> /usr/local/resmgr/bin) and then
>> running:
>>
>> java -Djava.ext.dirs=../lib org.apache.oodt.cas.resource.tools.JobSubmitter
>>
>> Which produces:
>>
>> JobSubmitter --rUrl <resource mgr url> [options]
>>   --file <job file path>
>>   [--dir <job file dir path>]
>>
>> This will let you submit a resource manager XML "job file", which looks like this:
>>
>> http://svn.apache.org/repos/asf/oodt/trunk/resource/src/main/resources/examples/jobs/exJob.xml
>>
>> Key parameters there are:
>>
>> Name - the human readable name of the job
>> Id - the id of the job
>> Instance Class - the JobInstance
>> Input Class - specification for how to read/write input for the job, with
>> properties
>>
>> That being said, interfacing with the resource manager at this level would be
>> a lot harder than simply running workflows, which yes, is the more
>> developer/user friendly interface for specifying tasks to run, which get
>> turned into jobs in resource manager ville.
>>
>>
>>> 2. Is PGE required to run/wrap non-Java programs so they can run within
>>> Resource Manager?
>>
>> Well, PGE doesn't directly run in Resource Manager. All workflow tasks are
>> submitted to Resource Manager using the TaskJob and TaskJobInput constructs:
>>
>> http://s.apache.org/I6S
>> http://s.apache.org/8F1
>>
>>
>>>
>>> Closing comments to Mike:
>>>
>>> If you are planning to use OODT for data management, it
>>> is initially very tempting to only set up and configure the minimal set of
>>> components because you will feel productive and it feels like progress is
>>> being made. Trust me I know since I was in your shoes about 6 months ago
>>> when trying to get some image processing IDL code to run and I badly needed
>>> to see progress (notice I didn't use the words "make progress").
>>> Because I wanted to use (what I thought was) the "easier" solution I ended up
>>> hardcoding paths to resources my python code needed in the code instead of
>>> passing the parameters into the code in the first place. This worked
>>> reasonably well as long as everything stayed the same....but then it didn't
>>> so I had to re-visit my "easier" setup and fix it.
>>> Recently I have been working to undo my mistakes and python has been
>>> very forgiving, but the best part was that all the strange and mystic
>>> Workflow setups and PGEConfig.xml files actually started to make a whole
>>> lot more sense. I am now able to configure and stand up a complete
>>> workflow config, then jump into PGEConfig and get the input parameters to
>>> my python code. This means if the input files I need to process change I
>>> don't need to change my python code, instead I can merely pass in a
>>> different set of parameters into the workflow and they will persist to my
>>> wrapped python.
>>> In short I know that combing through all the xml config is tough,
>>> especially when things are not working as quickly as you would like. I
>>> understand how defeating and frustrating it can be to have the component
>>> fail and just feel lost, not knowing what is causing the problem. I know
>>> the documentation isn't perfect and sometimes it is missing altogether, but
>>> the people that are on this list will bend over backwards to help you
>>> understand (some will even share their config files with line-by-line
>>> comments included at no extra charge ;)
>>>
>>> Thank you Sheryl for being awesome and helpful (you always are). Mike keep
>>> the questions coming and I will be sure to add in my $0.02 when I am able
>>> to.
>>
>> +1.
>>
>> Cheers,
>> Chris
>>
>>>
>>> On Mon, May 7, 2012 at 5:09 PM, Sheryl John <[email protected]> wrote:
>>>
>>>> Hi Mike,
>>>>
>>>> Yup, you can run your python scripts, java programs etc. from CAS-PGE which
>>>> is used with the Workflow Manager.
>>>> Check out this cas-pge guide [1] and the
>>>> other wiki pages related to workflow.
>>>>
>>>> You can use Resource Manager to run tasks sent from the Workflow Manager.
>>>> I've recently started testing this but there are others on the list who can
>>>> guide you more on the Resource Manager.
>>>>
>>>> HTH!
>>>>
>>>> Sheryl
>>>>
>>>> [1] https://cwiki.apache.org/OODT/cas-pge-learn-by-example.html
>>>>
>>>>
>>>> On Mon, May 7, 2012 at 3:43 PM, Iwunze, Michael C (GSFC-4700)[NOAA-JPSS] <
>>>> [email protected]> wrote:
>>>>
>>>>>
>>>>> I have two questions, I am able to run the Resource Manager with no
>>>>> issues. I have some python scripts and possibly some other programs I
>>>>> would like to run using the Resource Manager. From what I know so far I
>>>>> believe the cas-pge component needs to be used in conjunction with the
>>>>> Resource Manager and is used as a wrapper program for running my scripts.
>>>>> Can someone give me more information on how this can be accomplished or
>>>>> are there any examples to view?
>>>>>
>>>>> I would also like to be able to utilize the Job Scheduler, Monitor and
>>>>> Job queue classes that are part of the Resource Manager. I can't find any
>>>>> examples of how they are used anywhere. And if examples do exist can
>>>>> someone point me in the right direction or give me more information on
>>>>> this?
>>>>>
>>>>> Thanks
>>>>>
>>>>> Mike
>>>>
>>>>
>>>>
>>>>
>>>> --
>>>> -Sheryl
>>>>
>>
>>
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Senior Computer Scientist
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 171-266B, Mailstop: 171-246
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Assistant Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
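
The JobSubmitter invocation Chris describes in the thread above can be sketched in Python, the language Mike's scripts are written in. This is only a minimal sketch, not an official OODT client. The class name, the -Djava.ext.dirs=../lib flag, the --rUrl and --file options and the /usr/local/resmgr/bin directory come from the thread; the function name, the resource manager URL and the bare exJob.xml path are hypothetical placeholders. The sketch only builds and prints the command, so it runs without a resource manager installed.

```python
import shlex
from pathlib import Path

# Class name taken from the thread; everything configurable below is illustrative.
JOB_SUBMITTER_CLASS = "org.apache.oodt.cas.resource.tools.JobSubmitter"


def build_job_submitter_command(resmgr_url, job_file, lib_dir="../lib"):
    """Return the argv list for one JobSubmitter run (nothing is executed here)."""
    return [
        "java",
        f"-Djava.ext.dirs={lib_dir}",  # JVM flag as quoted in the thread
        JOB_SUBMITTER_CLASS,
        "--rUrl", resmgr_url,          # URL of the running resource manager
        "--file", str(job_file),       # path to the XML job file
    ]


if __name__ == "__main__":
    # Hypothetical values: adjust the URL and job file to your deployment.
    cmd = build_job_submitter_command("http://localhost:9002", Path("exJob.xml"))
    # Print rather than run, so the sketch works without a resource manager.
    print(shlex.join(cmd))
    # To actually submit, one option (assuming the deployment path from the thread):
    #   import subprocess
    #   subprocess.run(cmd, cwd="/usr/local/resmgr/bin", check=True)
```

Keeping the URL and job file as arguments rather than constants follows Cameron's advice in the thread: pass parameters in instead of hardcoding paths in the script.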
