Hi Val,

Thanks for the detailed report. My suggestion would be to use CAS-PGE directly instead of ExternScriptTaskInstance. That application is not well maintained, doesn't produce a log, etc., etc., all of the things you've noted.

CAS-PGE, on the other hand, will (a) prepare input for your task; (b) describe how to run your task (even as a script, and it will generate the script); and (c) run met extractors and fork a crawler in your job directory at the end. I think it's what you're looking for, and it's much better documented on the wiki.

Please check it out and let me know what you think.

Cheers,
Chris

------------------------
Chris Mattmann
[email protected]

-----Original Message-----
From: "Mallder, Valerie" <[email protected]>
Reply-To: <[email protected]>
Date: Monday, October 6, 2014 at 11:53 PM
To: "[email protected]" <[email protected]>
Subject: how to pass arguments to workflow task that is external script

>Hello,
>
>I'm stuck again :( This time I'm stuck trying to start my crawler as a
>task using the workflow manager. I am not using a PGE task right now.
>I'm just trying to do something simple with the workflow manager,
>filemgr, and crawler. I have read all of the documentation that is
>available on the workflow manager and have tried to piece together a
>setup based on the examples, but, things seem to be working differently
>now and the documentation hasn't caught up, which is totally
>understandable and not a criticism. Just want you to know that I try to
>do my due diligence before bothering anyone for help.
>
>I am not running the resource manager, and I have commented out setting
>the resource manager url in the workflow.properties file so that workflow
>manager will execute the job locally.
>
>I am sending workflow manager an event (via the command line using
>wmgr-client) called "startJediPipeline". Workflow manager receives the
>event, and retrieves my workflow from the repository and tries to execute
>the first (and only) task, and then it crashes. My task is an external
>script (the crawler_launcher script) and I need to pass several arguments
>to it.
>I've spent all day trying to figure out how to pass arguments to
>the ExternScriptTaskInstance, but there are no examples of doing
>this, so I had to wing it. I tried putting the arguments in the task
>configuration properties. That didn't work. So I tried putting the
>arguments in the metadata properties, and that hasn't worked. So, your
>suggestions are welcome! Thanks so much. Here's the error log, and the
>contents of my tasks.xml file follow it at the end.
>
>Workflow Manager started PID file
>(/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/workflow/run/cas.workflow.pid).
>Starting OODT File Manager [ Successful ]
>Starting OODT Resource Manager [ Failed ]
>Starting OODT Workflow Manager [ Successful ]
>slothrop:{~/project/jedi/users/jedi-pipeline/oodt-deploy/bin} Oct 06, 2014 5:48:30 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager loadProperties
>INFO: Loading Workflow Manager Configuration Properties from:
>[/homes/malldva1/project/jedi/users/jedi-pipeline/oodt-deploy/workflow/etc/workflow.properties]
>Oct 06, 2014 5:48:30 PM org.apache.oodt.cas.workflow.engine.ThreadPoolWorkflowEngineFactory getResmgrUrl
>INFO: No Resource Manager URL provided or malformed URL: executing jobs locally.
>URL: [null]
>Oct 06, 2014 5:48:30 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager <init>
>INFO: Workflow Manager started by malldva1
>Oct 06, 2014 5:48:41 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager handleEvent
>INFO: WorkflowManager: Received event: startJediPipeline
>Oct 06, 2014 5:48:41 PM org.apache.oodt.cas.workflow.system.XmlRpcWorkflowManager handleEvent
>INFO: WorkflowManager: Workflow Jedi Pipeline Workflow retrieved for event startJediPipeline
>Oct 06, 2014 5:48:41 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread checkTaskRequiredMetadata
>INFO: Task: [Crawler Task] has no required metadata fields
>Oct 06, 2014 5:48:42 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread executeTaskLocally
>INFO: Executing task: [Crawler Task] locally
>java.lang.NullPointerException
>    at org.apache.oodt.cas.workflow.examples.ExternScriptTaskInstance.run(ExternScriptTaskInstance.java:72)
>    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.executeTaskLocally(IterativeWorkflowProcessorThread.java:574)
>    at org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread.run(IterativeWorkflowProcessorThread.java:321)
>    at EDU.oswego.cs.dl.util.concurrent.PooledExecutor$Worker.run(Unknown Source)
>    at java.lang.Thread.run(Thread.java:745)
>Oct 06, 2014 5:48:42 PM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread executeTaskLocally
>WARNING: Exception executing task: [Crawler Task] locally: Message: null
>
><cas:tasks xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
><!--
>  TODO: Add some examples
>-->
>
>  <task id="urn:oodt:crawlerTask" name="Crawler Task" class="org.apache.oodt.cas.workflow.examples.ExternScriptTaskInstance"/>
>    <conditions/> <!-- There are no pre execution conditions right now -->
>    <configuration>
>      <property name="ShellType" value="/bin/sh" />
>      <property name="PathToScript"
>                value="[OODT_HOME]/crawler/bin/crawler_launcher"/>
>    </configuration>
>    <metadata>
>      <args>
>        <arg>--operation</arg>
>        <arg>--launchAutoCrawler</arg>
>        <arg>--productPath</arg>
>        <arg>[OODT_HOME]/data/staging</arg>
>        <arg>--filemgrUrl</arg>
>        <arg>http://localhost:9000</arg>
>        <arg>--clientTransferer</arg>
>        <arg>org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory</arg>
>        <arg>--mimeExtractorRepo</arg>
>        <arg>[$OODT_HOME]/extensions/policy/mime-extractor-map.xml</arg>
>        <arg>--actionIds</arg>
>        <arg>MoveFileToLevel0Dir</arg>
>      </args>
>    </metadata>
></cas:tasks>
>
>
>Valerie A. Mallder
>
>New Horizons Deputy Mission System Engineer
>The Johns Hopkins University/Applied Physics Laboratory
>11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
>240-228-7846 (Office) 410-504-2233 (Blackberry)
>
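
For readers following the thread, the three stages listed in the reply, (a) prepare input, (b) generate and run a script, and (c) collect products from the job directory, can be illustrated with a generic sketch. This is not OODT or CAS-PGE code and does not use any OODT API. The function name `run_job`, the file names, and the `*.dat` product convention are all hypothetical, chosen only to show the shape of the flow and how a list of arguments reaches an external script. It assumes a POSIX system with `/bin/sh`.

```python
import os
import subprocess
import tempfile


def run_job(job_dir, args):
    """Run a generated script in job_dir; return (exit code, product names).

    Hypothetical illustration only, not part of OODT.
    """
    job_dir = os.path.abspath(job_dir)

    # (a) Prepare input: stage a placeholder input file in the job directory.
    with open(os.path.join(job_dir, "input.txt"), "w") as f:
        f.write("sample input\n")

    # (b) Describe how to run the task by generating a script.
    # The script writes each argument it receives to a product file,
    # one argument per line.
    script_path = os.path.join(job_dir, "run_task.sh")
    with open(script_path, "w") as f:
        f.write("#!/bin/sh\n")
        f.write("printf '%s\\n' \"$@\" > received_args.dat\n")

    # Arguments are passed as a list, so no manual quoting or joining is needed.
    result = subprocess.run(["/bin/sh", script_path] + list(args), cwd=job_dir)

    # (c) After the run, scan the job directory for products (here: *.dat files).
    products = sorted(n for n in os.listdir(job_dir) if n.endswith(".dat"))
    return result.returncode, products


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as job_dir:
        code, products = run_job(job_dir, ["--productPath", "/data/staging"])
        print(code, products)
```

Passing the arguments as a list rather than building one command string avoids quoting problems when a value contains spaces. In a real deployment the equivalent of each stage would come from the CAS-PGE configuration described on the wiki, not from hand-written code like this.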
