+1 billion

--Paul

Sent from my iPhone

> On Oct 8, 2014, at 5:55 PM, Lewis John Mcgibbney <[email protected]> 
> wrote:
> 
> Folks,
> Is it possible to create a parent issue for defining XSDs for all of the
> XML files we need in OODT?
> I do not know them all, but from this thread alone, it is clear that we
> could do with setting some kind of restrictions on what can be included
> within task and configuration XML within OODT.
> Thoughts?
> Lewis
> 
> On Wed, Oct 8, 2014 at 5:44 PM, Verma, Rishi (398J) <
> [email protected]> wrote:
> 
>> Hi Val,
>> 
>> Yep - here’s a link to the tasks.xml file:
>> 
>> https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/policy/tasks.xml
>> 
>>> The problem is that the ExternScriptTaskInstance is unable to recognize
>>> the command line arguments that I want to pass to the crawler_launcher
>>> script.
>> 
>> 
>> Hmm.. could you share your workflow manager log, or better yet, the
>> batch_stub output? Curious to see what error is thrown.
>> 
>> Is a script file being generated for your PGE? For example, inside your
>> [PGE_HOME] directory, and within the particular job directory created for
>> your execution of a workflow, you will see some files starting with
>> “sciPgeExeScript_…”. You’ll find one for your pgeConfig, and you can check
>> to see what the PGE commands actually translate into, with respect to a
>> shell script format. If that file is there, take a look at it, and validate
>> whether the command works within the script (i.e. copy/paste and run the
>> crawler command manually).
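
A minimal sketch of how one might locate those generated wrapper scripts, assuming only what is described above (files whose names start with "sciPgeExeScript_" somewhere under the PGE job directories). The function name list_pge_scripts and the PGE_HOME variable in the usage line are illustrative, not part of OODT.

```shell
#!/bin/sh
# List generated PGE wrapper scripts found anywhere under a directory.
# $1: directory to search, e.g. your PGE home or a single job directory.
list_pge_scripts() {
    find "$1" -type f -name 'sciPgeExeScript_*' | sort
}

# Hypothetical usage (PGE_HOME is an assumed environment variable):
#   list_pge_scripts "$PGE_HOME"
```

Each path printed can then be opened to see the exact shell command the PGE would run.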
>> 
>> Another suggestion is to take a step back and build up slowly, i.e.:
>> 1. Do an "echo" command within your PGE first (e.g. <cmd>echo "Hello
>> APL." > /tmp/test.txt</cmd>).
>> 2. If the above works, run crawler_launcher with no arguments (e.g.
>> <cmd>/path/to/oodt/crawler/bin/crawler_launcher</cmd>) and verify that the
>> batch_stub or Workflow Manager prints some kind of output when you run the
>> workflow.
>> 3. Build up your crawler_launcher command piece by piece to see where it
>> is failing.
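
The same build-up idea can be tried outside the workflow first. The sketch below is only an illustration: run_steps is a made-up helper, and the commands in the usage comment are placeholders to be replaced with your own progressively longer crawler_launcher invocations.

```shell
#!/bin/sh
# Run each command string in order and stop at the first one that fails,
# so the failing piece of a longer command line is easy to spot.
run_steps() {
    step=1
    for cmd in "$@"; do
        echo "step $step: $cmd"
        if ! /bin/sh -c "$cmd"; then
            echo "step $step FAILED"
            return 1
        fi
        step=$((step + 1))
    done
    echo "all steps passed"
}

# Hypothetical usage, starting small and adding one option at a time:
#   run_steps 'echo "Hello APL." > /tmp/test.txt' \
#             '/path/to/oodt/crawler/bin/crawler_launcher'
```

Running the steps under /bin/sh mirrors the shell named in the pgeConfig, which helps rule out differences between shells.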
>> 
>> Thanks,
>> Rishi
>> 
>> On Oct 8, 2014, at 4:24 PM, Mallder, Valerie <[email protected]>
>> wrote:
>> 
>>> Hi Rishi,
>>> 
>>> Thank you very much for pointing me to your working example. This is
>>> very helpful. My pgeConfig looks very similar to yours. So, I commented
>>> out the resource manager like you suggested and tried running again without
>>> the resource manager. And my problem still exists. The problem is that the
>>> ExternScriptTaskInstance is unable to recognize the command line arguments
>>> that I want to pass to the crawler_launcher script. Could you send me a
>>> link to your tasks.xml file? I'm curious as to how you defined your task.
>>> My pgeConfig and tasks.xml are below.
>>> 
>>> Thanks!
>>> Val
>>> 
>>> 
>>> <?xml version="1.0" encoding="UTF-8"?>
>>> <pgeConfig>
>>> 
>>>  <!-- How to run the PGE -->
>>>  <exe dir="[JobDir]" shell="/bin/sh" envReplace="true">
>>>       <cmd>[CRAWLER_HOME]/bin/crawler_launcher --operation --launchAutoCrawler \
>>>       --filemgrUrl [FILEMGR_URL] \
>>>       --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
>>>       --productPath [JobInputDir] \
>>>       --mimeExtractorRepo [OODT_HOME]/extensions/policy/mime-extractor-map.xml \
>>>       --actionIds MoveFileToLevel0Dir</cmd>
>>>  </exe>
>>> 
>>>  <!-- Files to ingest -->
>>>  <output>
>>>  </output>
>>> 
>>> <!-- Custom metadata to add to output files -->
>>>  <customMetadata>
>>>     <metadata key="JobDir" val="[OODT_HOME]"/>
>>>     <metadata key="JobInputDir" val="[FEI_DROP_DIR]"/>
>>>     <metadata key="JobOutputDir" val="[JobDir]/data/pge/jobs"/>
>>>     <metadata key="JobLogDir" val="[JobDir]/data/pge/logs"/>
>>>  </customMetadata>
>>> 
>>> </pgeConfig>
>>> 
>>> 
>>> 
>>> <!-- tasks.xml **************************************************-->
>>> 
>>> <cas:tasks xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
>>> 
>>>  <task id="urn:oodt:crawlerLauncherId" name="crawlerLauncherName"
>>>        class="org.apache.oodt.cas.workflow.examples.ExternScriptTaskInstance">
>>>     <conditions/>  <!-- There are no pre execution conditions right now -->
>>>     <configuration>
>>> 
>>>         <property name="ShellType" value="/bin/sh" />
>>>         <property name="PathToScript"
>>>                   value="[CRAWLER_HOME]/bin/crawler_launcher" envReplace="true" />
>>> 
>>>         <property name="PGETask_Name" value="crawler_launcher PGE Task"/>
>>>         <property name="PGETask_ConfigFilePath"
>>>                   value="[OODT_HOME]/extensions/config/crawler-pge-config.xml"
>>>                   envReplace="true" />
>>>     </configuration>
>>>  </task>
>>> 
>>> </cas:tasks>
>>> 
>>> Valerie A. Mallder
>>> New Horizons Deputy Mission System Engineer
>>> Johns Hopkins University/Applied Physics Laboratory
>>> 
>>> 
>>>> -----Original Message-----
>>>> From: Verma, Rishi (398J) [mailto:[email protected]]
>>>> Sent: Wednesday, October 08, 2014 6:01 PM
>>>> To: [email protected]
>>>> Subject: Re: what is batch stub? Is it necessary?
>>>> 
>>>> Hi Valerie,
>>>> 
>>>>>>>> All I am trying to do is run "crawler_launcher" as a workflow task
>>>>>>>> in the CAS PGE environment.
>>>> 
>>>> Interesting. I have a working example here [1] you can look at that
>>>> does this exact thing.
>>>> 
>>>>>>>> So, if "batchstub" is necessary in this scenario, please tell me
>>>>>>>> what it is, why it is necessary, and how to run it (please provide
>>>>>>>> exact syntax to put in my startup shell script, because I would
>>>>>>>> never be able to figure it out for myself and I don't want to have
>>>>>>>> to bother everyone again.)
>>>> 
>>>> Batchstub is only necessary if your Workflow Manager is sending jobs to
>>>> Resource Manager for execution (where the default execution is to run the
>>>> job in something called a "batch stub" executable). Think of a batch stub
>>>> as a small wrapper program that takes a bundle of executable instructions
>>>> from Resource Manager, and executes them in a shell environment within a
>>>> given remote (or local) machine.
>>>> 
>>>> Here's my suggestion:
>>>> 1. Like Paul suggested, go to $OODT_HOME/resmgr/bin, and execute the
>>>> following command (it'll start a batch stub in a terminal on port 2001):
>>>>> ./batch_stub 2001
>>>> 
>>>> If the above step doesn't fix your problem, you can also try having
>>>> Workflow Manager NOT send jobs to Resource Manager for execution, and
>>>> instead execute jobs locally through Workflow Manager itself (on localhost
>>>> only!). To disable job transfer to Resource Manager, you'll need to modify
>>>> the Workflow Manager properties file
>>>> ($OODT_HOME/wmgr/etc/workflow.properties), and specifically comment out the
>>>> "org.apache.oodt.cas.workflow.engine.resourcemgr.url" line. I've done this
>>>> in my example code; see [2] for an exact example. After modifying
>>>> workflow.properties, make sure to restart Workflow Manager
>>>> ($OODT_HOME/wmgr/bin/wmgr stop followed by $OODT_HOME/wmgr/bin/wmgr
>>>> start).
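
As a rough sketch of the edit described above, the property line could be commented out with sed rather than by hand. The helper name disable_resmgr_url is hypothetical; only the property name and the workflow.properties path come from the message, and the script keeps a .bak copy in case the change needs to be undone.

```shell
#!/bin/sh
# Comment out the Resource Manager URL property in a workflow.properties file.
# A backup of the original is kept next to it as <file>.bak.
disable_resmgr_url() {
    props="$1"
    cp "$props" "$props.bak" || return 1
    # Prefix the matching line with '#'; already-commented lines do not match.
    sed 's|^org\.apache\.oodt\.cas\.workflow\.engine\.resourcemgr\.url|#&|' \
        "$props.bak" > "$props"
}

# Hypothetical usage, followed by the restart described above:
#   disable_resmgr_url "$OODT_HOME/wmgr/etc/workflow.properties"
#   "$OODT_HOME/wmgr/bin/wmgr" stop
#   "$OODT_HOME/wmgr/bin/wmgr" start
```

Because the pattern is anchored to the start of the line, running the helper a second time leaves the file unchanged.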
>>>> 
>>>> Thanks,
>>>> Rishi
>>>> 
>>>> [1] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/pge/src/main/resources/policy/netscan-getipv4entriesrandomsample.xml
>>>> [2] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/etc/workflow.properties
>>>> 
>>>> On Oct 8, 2014, at 2:31 PM, Ramirez, Paul M (398J)
>>>> <[email protected]> wrote:
>>>> 
>>>>> Valerie,
>>>>> 
>>>>> I would have thought it would have just not used a batch stub by
>>>>> default. That said, if you go into $OODT_HOME/resmgr/bin there should be
>>>>> a script to start a batch stub. Right now on my phone I forget the name of
>>>>> the script, but if you "more" the file you will see the Java class name
>>>>> that corresponds to below. You should specify a port when you run the
>>>>> script, which from the looks of the output below should be 2001.
>>>>> 
>>>>> HTH,
>>>>> Paul R
>>>>> 
>>>>> Sent from my iPhone
>>>>> 
>>>>>> On Oct 8, 2014, at 2:04 PM, Mallder, Valerie <[email protected]> wrote:
>>>>>> 
>>>>>> Well then, I'm proud to be a member :)  (I think .... )
>>>>>> 
>>>>>> 
>>>>>> Valerie A. Mallder
>>>>>> New Horizons Deputy Mission System Engineer
>>>>>> Johns Hopkins University/Applied Physics Laboratory
>>>>>> 
>>>>>> 
>>>>>>> -----Original Message-----
>>>>>>> From: Bruce Barkstrom [mailto:[email protected]]
>>>>>>> Sent: Wednesday, October 08, 2014 4:54 PM
>>>>>>> To: [email protected]
>>>>>>> Subject: Re: what is batch stub? Is it necessary?
>>>>>>> 
>>>>>>> You have every right to bother everyone.
>>>>>>> You won't get what you need unless you do.
>>>>>>> 
>>>>>>> You get one honorary membership in the Society of General Agitators
>>>>>>> - at the rank of Major Agitator.
>>>>>>> 
>>>>>>> Bruce B.
>>>>>>> 
>>>>>>> On Wed, Oct 8, 2014 at 4:49 PM, Mallder, Valerie
>>>>>>> <[email protected]> wrote:
>>>>>>> 
>>>>>>>> Hello,
>>>>>>>> 
>>>>>>>> I am still having trouble getting my CAS PGE crawler task to run
>>>>>>>> due to http://localhost:2001 being "down". I have spent the last 2
>>>>>>>> days tracing through the resource manager code and tracked this down
>>>>>>>> to line 146 of LRUScheduler, where the XmlRpcBatchMgr is failing to
>>>>>>>> execute the task remotely because line 75 of XmlRpcBatchMgrProxy
>>>>>>>> (which was instantiated by XmlRpcBatchMgr on its line 74) is trying
>>>>>>>> to call "isAlive" on the web service named "batchstub", which, to my
>>>>>>>> knowledge, is not running because I have not done anything explicitly
>>>>>>>> to run it.
>>>>>>>> 
>>>>>>>> All I am trying to do is run "crawler_launcher" as a workflow task
>>>>>>>> in the CAS PGE environment.  I had it running perfectly before I
>>>>>>>> started trying to make it run as part of a workflow.  I really miss
>>>>>>>> my crawler and really want it to run again :(
>>>>>>>> 
>>>>>>>> So, if "batchstub" is necessary in this scenario, please tell me
>>>>>>>> what it is, why it is necessary, and how to run it (please provide
>>>>>>>> exact syntax to put in my startup shell script, because I would
>>>>>>>> never be able to figure it out for myself and I don't want to have
>>>>>>>> to bother everyone again.)
>>>>>>>> 
>>>>>>>> Thanks so much!
>>>>>>>> 
>>>>>>>> Val
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> 
>>>>>>>> Valerie A. Mallder
>>>>>>>> 
>>>>>>>> New Horizons Deputy Mission System Engineer
>>>>>>>> The Johns Hopkins University/Applied Physics Laboratory
>>>>>>>> 11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
>>>>>>>> 240-228-7846 (Office) 410-504-2233 (Blackberry)
>>>> 
>>>> ---
>>>> Rishi Verma
>>>> NASA Jet Propulsion Laboratory
>>>> California Institute of Technology
>> 
>> ---
>> Rishi Verma
>> NASA Jet Propulsion Laboratory
>> California Institute of Technology
> 
> 
> -- 
> *Lewis*
