ack, git push

------------------------
Chris Mattmann
[email protected]




-----Original Message-----
From: "Ramirez, Paul M (398J)" <[email protected]>
Reply-To: <[email protected]>
Date: Thursday, October 9, 2014 at 4:37 AM
To: "<[email protected]>" <[email protected]>
Subject: Re: what is batch stub? Is it necessary?

>+1 billion
>
>--Paul
>
>Sent from my iPhone
>
>> On Oct 8, 2014, at 5:55 PM, Lewis John Mcgibbney
>><[email protected]> wrote:
>> 
>> Folks,
>> Is it possible to create a parent issue for defining XSDs for all of the
>> XML files we need in OODT?
>> I do not know them all, but from this thread alone, it is clear that we
>> could do with setting some kind of restrictions on what can be included
>> within task and configuration XML within OODT.
>> Thoughts?
>> Lewis
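
A well-formedness check is a much weaker guard than the XSDs proposed above, but it catches mismatched tags before a policy file reaches OODT. This is only a sketch using Python's standard library; the function name check_well_formed is made up for illustration and is not part of OODT.

```python
import xml.etree.ElementTree as ET


def check_well_formed(xml_text):
    """Return None if xml_text parses, else the parser's error message.

    This checks well-formedness only; it does not validate against an XSD.
    """
    try:
        ET.fromstring(xml_text)
    except ET.ParseError as err:
        return str(err)
    return None


# Usage: read a task or PGE config file and print any parse error, e.g.
#   with open("tasks.xml", encoding="utf-8") as fh:
#       print(check_well_formed(fh.read()))
```

Full schema validation would need an XSD-aware library, which the standard library does not provide.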
>> 
>> On Wed, Oct 8, 2014 at 5:44 PM, Verma, Rishi (398J) <
>> [email protected]> wrote:
>> 
>>> Hi Val,
>>> 
>>> Yep - here's a link to the tasks.xml file:
>>> 
>>> https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/policy/tasks.xml
>>> 
>>>> The problem is that the ExternScriptTaskInstance is unable to recognize
>>>> the command line arguments that I want to pass to the crawler_launcher
>>>> script.
>>> 
>>> 
>>> Hmm.. could you share your workflow manager log, or better yet, the
>>> batch_stub output? Curious to see what error is thrown.
>>> 
>>> Is a script file being generated for your PGE? For example, inside your
>>> [PGE_HOME] directory, and within the particular job directory created for
>>> your execution of a workflow, you will see some files starting with
>>> "sciPgeExeScript_". You'll find one for your pgeConfig, and you can check
>>> what the PGE commands actually translate into as a shell script. If that
>>> file is there, take a look at it, and validate whether the command works
>>> within the script (i.e. copy/paste and run the crawler command manually).
>>> 
>>> Another suggestion is to take a step back, and build up slowly, i.e.:
>>> 1. Do an "echo" command within your PGE first (e.g. <cmd>echo "Hello
>>> APL." > /tmp/test.txt</cmd>).
>>> 2. If the above works, run crawler_launcher with no arguments (e.g.
>>> <cmd>/path/to/oodt/crawler/bin/crawler_launcher</cmd>) and verify that
>>> the batch_stub or Workflow Manager prints some kind of output when you
>>> run the workflow.
>>> 3. Build up your crawler_launcher command piece by piece to see where it
>>> is failing.
>>> 
>>> Thanks,
>>> Rishi
>>> 
>>> On Oct 8, 2014, at 4:24 PM, Mallder, Valerie
>>><[email protected]>
>>> wrote:
>>> 
>>>> Hi Rishi,
>>>> 
>>>> Thank you very much for pointing me to your working example. This is
>>>> very helpful. My pgeConfig looks very similar to yours. So, I commented
>>>> out the resource manager like you suggested and tried running again
>>>> without the resource manager, and my problem still exists. The problem
>>>> is that the ExternScriptTaskInstance is unable to recognize the command
>>>> line arguments that I want to pass to the crawler_launcher script. Could
>>>> you send me a link to your tasks.xml file? I'm curious as to how you
>>>> defined your task. My pgeConfig and tasks.xml are below.
>>>> 
>>>> Thanks!
>>>> Val
>>>> 
>>>> 
>>>> <?xml version="1.0" encoding="UTF-8"?>
>>>> <pgeConfig>
>>>> 
>>>>  <!-- How to run the PGE -->
>>>>  <exe dir="[JobDir]" shell="/bin/sh" envReplace="true">
>>>>       <cmd>[CRAWLER_HOME]/bin/crawler_launcher --operation --launchAutoCrawler \
>>>>       --filemgrUrl [FILEMGR_URL] \
>>>>       --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
>>>>       --productPath [JobInputDir] \
>>>>       --mimeExtractorRepo [OODT_HOME]/extensions/policy/mime-extractor-map.xml \
>>>>       --actionIds MoveFileToLevel0Dir</cmd>
>>>>  </exe>
>>>> 
>>>>  <!-- Files to ingest -->
>>>>  <output>
>>>>  </output>
>>>> 
>>>> <!-- Custom metadata to add to output files -->
>>>>  <customMetadata>
>>>>     <metadata key="JobDir" val="[OODT_HOME]"/>
>>>>     <metadata key="JobInputDir" val="[FEI_DROP_DIR]"/>
>>>>     <metadata key="JobOutputDir" val="[JobDir]/data/pge/jobs"/>
>>>>     <metadata key="JobLogDir" val="[JobDir]/data/pge/logs"/>
>>>>  </customMetadata>
>>>> 
>>>> </pgeConfig>
>>>> 
>>>> 
>>>> 
>>>> <!-- tasks.xml **************************************************-->
>>>> 
>>>> <cas:tasks xmlns:cas="http://oodt.jpl.nasa.gov/1.0/cas">
>>>> 
>>>>  <task id="urn:oodt:crawlerLauncherId" name="crawlerLauncherName"
>>>>        class="org.apache.oodt.cas.workflow.examples.ExternScriptTaskInstance">
>>>>     <conditions/>  <!-- There are no pre execution conditions right now -->
>>>>     <configuration>
>>>> 
>>>>         <property name="ShellType" value="/bin/sh" />
>>>>         <property name="PathToScript"
>>>>                   value="[CRAWLER_HOME]/bin/crawler_launcher" envReplace="true" />
>>>> 
>>>>         <property name="PGETask_Name" value="crawler_launcher PGE Task"/>
>>>>         <property name="PGETask_ConfigFilePath"
>>>>                   value="[OODT_HOME]/extensions/config/crawler-pge-config.xml"
>>>>                   envReplace="true" />
>>>>     </configuration>
>>>>  </task>
>>>> 
>>>> </cas:tasks>
>>>> 
>>>> Valerie A. Mallder
>>>> New Horizons Deputy Mission System Engineer
>>>> Johns Hopkins University/Applied Physics Laboratory
>>>> 
>>>> 
>>>>> -----Original Message-----
>>>>> From: Verma, Rishi (398J) [mailto:[email protected]]
>>>>> Sent: Wednesday, October 08, 2014 6:01 PM
>>>>> To: [email protected]
>>>>> Subject: Re: what is batch stub? Is it necessary?
>>>>> 
>>>>> Hi Valerie,
>>>>> 
>>>>>>>>> All I am trying to do is run "crawler_launcher" as a workflow task
>>>>>>>>> in the CAS PGE environment.
>>>>> 
>>>>> Interesting. I have a working example here [1] you can look at that
>>>>> does this exact thing.
>>>>> 
>>>>>>>>> So, if "batchstub" is necessary in this scenario, please tell me
>>>>>>>>> what it is, why it is necessary, and how to run it (please provide
>>>>>>>>> exact syntax to put in my startup shell script, because I would
>>>>>>>>> never be able to figure it out for myself and I don't want to have
>>>>>>>>> to bother everyone again.)
>>>>> 
>>>>> Batchstub is only necessary if your Workflow Manager is sending jobs to
>>>>> Resource Manager for execution (where the default execution is to run
>>>>> the job in something called a "batch stub" executable). Think of a batch
>>>>> stub as a small wrapper program that takes a bundle of executable
>>>>> instructions from Resource Manager, and executes them in a shell
>>>>> environment within a given remote (or local) machine.
>>>>> 
>>>>> Here's my suggestion:
>>>>> 1. Like Paul suggested, go to $OODT_HOME/resmgr/bin, and execute the
>>>>> following command (it'll start a batch stub in a terminal on port 2001):
>>>>>> ./batch_stub 2001
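
To confirm that something is actually listening on the batch stub port after starting it, a plain TCP connection test is enough. This is a minimal sketch in Python; the helper name is_port_open is hypothetical, and it only shows that the port accepts connections, not that the process behind it is a working batch stub.

```python
import socket


def is_port_open(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unresolvable hosts.
        return False


# Usage (port 2001 is the one mentioned in this thread):
#   print(is_port_open("localhost", 2001))
```

If this reports False right after launching batch_stub, the "down" error from Resource Manager would be expected.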
>>>>> 
>>>>> If the above step doesn't fix your problem, you can also try having
>>>>> Workflow Manager NOT send jobs to Resource Manager for execution, and
>>>>> instead execute jobs locally through Workflow Manager itself (on
>>>>> localhost only!). To disable job transfer to Resource Manager, you'll
>>>>> need to modify the Workflow Manager properties file
>>>>> ($OODT_HOME/wmgr/etc/workflow.properties), and specifically comment out
>>>>> the "org.apache.oodt.cas.workflow.engine.resourcemgr.url" line. I've
>>>>> done this in my example code; see [2] for an exact example. After
>>>>> modifying workflow.properties, make sure to restart Workflow Manager
>>>>> ($OODT_HOME/wmgr/bin/wmgr stop   followed by $OODT_HOME/wmgr/bin/wmgr
>>>>> start).
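
The properties edit described above can be done by hand in any editor. For scripted setups, here is a rough sketch in Python that comments out one key in properties-style text; the function name comment_out_property is hypothetical, and the sketch assumes one "key=value" (or "key: value") entry per line with no line continuations.

```python
def comment_out_property(text, key):
    """Prefix '#' to every active line that sets `key`; leave the rest as-is."""
    result = []
    for line in text.splitlines(keepends=True):
        stripped = line.lstrip()
        # Match only "key=" or "key:" (optionally with spaces before the
        # separator), so longer keys sharing the same prefix are not touched.
        is_target = (
            stripped.startswith(key)
            and stripped[len(key):].lstrip()[:1] in ("=", ":")
        )
        result.append("#" + line if is_target else line)
    return "".join(result)


# Usage: read workflow.properties, pass its text through
#   comment_out_property(text, "org.apache.oodt.cas.workflow.engine.resourcemgr.url")
# write the result back, then restart Workflow Manager.
```

Lines that are already commented out start with "#", so they are left unchanged and the function is safe to run twice.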
>>>>> 
>>>>> Thanks,
>>>>> Rishi
>>>>> 
>>>>> [1] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/pge/src/main/resources/policy/netscan-getipv4entriesrandomsample.xml
>>>>> [2] https://github.com/riverma/xdata-jpl-netscan/blob/master/oodt-netscan/workflow/src/main/resources/etc/workflow.properties
>>>>> 
>>>>> On Oct 8, 2014, at 2:31 PM, Ramirez, Paul M (398J)
>>>>> <[email protected]> wrote:
>>>>> 
>>>>>> Valerie,
>>>>>> 
>>>>>> I would have thought it would have just not used a batch stub by
>>>>>> default. That said, if you go into $OODT_HOME/resmgr/bin there
>>>>>> should be a script to start a batch stub. Right now on my phone I
>>>>>> forget the name of the script, but if you "more" the file you will see
>>>>>> the Java class name that corresponds to below. You should specify a
>>>>>> port when you run the script, which from the looks of the output below
>>>>>> should be 2001.
>>>>>> 
>>>>>> HTH,
>>>>>> Paul R
>>>>>> 
>>>>>> Sent from my iPhone
>>>>>> 
>>>>>>> On Oct 8, 2014, at 2:04 PM, Mallder, Valerie <[email protected]> wrote:
>>>>>>> 
>>>>>>> Well then, I'm proud to be a member :)  (I think .... )
>>>>>>> 
>>>>>>> 
>>>>>>> Valerie A. Mallder
>>>>>>> New Horizons Deputy Mission System Engineer Johns Hopkins
>>>>>>> University/Applied Physics Laboratory
>>>>>>> 
>>>>>>> 
>>>>>>>> -----Original Message-----
>>>>>>>> From: Bruce Barkstrom [mailto:[email protected]]
>>>>>>>> Sent: Wednesday, October 08, 2014 4:54 PM
>>>>>>>> To: [email protected]
>>>>>>>> Subject: Re: what is batch stub? Is it necessary?
>>>>>>>> 
>>>>>>>> You have every right to bother everyone.
>>>>>>>> You won't get what you need unless you do.
>>>>>>>> 
>>>>>>>> You get one honorary membership in the Society of General Agitators
>>>>>>>> - at the rank of Major Agitator.
>>>>>>>> 
>>>>>>>> Bruce B.
>>>>>>>> 
>>>>>>>> On Wed, Oct 8, 2014 at 4:49 PM, Mallder, Valerie
>>>>>>>> <[email protected]
>>>>>>>>> wrote:
>>>>>>>> 
>>>>>>>>> Hello,
>>>>>>>>> 
>>>>>>>>> I am still having trouble getting my CAS PGE crawler task to run
>>>>>>>>> due to http://localhost:2001 being "down". I have spent the last 2
>>>>>>>>> days tracing through the resource manager code and tracked this
>>>>>>>>> down to line 146 of LRUScheduler, where the XmlRpcBatchMgr is
>>>>>>>>> failing to execute the task remotely, because line 75 of
>>>>>>>>> XmlRpcBatchMgrProxy (which was instantiated by XmlRpcBatchMgr on
>>>>>>>>> its line 74) is trying to call "isAlive" on the webservice named
>>>>>>>>> "batchstub" which, to my knowledge, is not running because I have
>>>>>>>>> not done anything explicitly to run it.
>>>>>>>>> 
>>>>>>>>> All I am trying to do is run "crawler_launcher" as a workflow task
>>>>>>>>> in the CAS PGE environment.  I had it running perfectly before I
>>>>>>>>> started trying to make it run as part of a workflow.  I really miss
>>>>>>>>> my crawler and really want it to run again :(
>>>>>>>>> 
>>>>>>>>> So, if "batchstub" is necessary in this scenario, please tell me
>>>>>>>>> what it is, why it is necessary, and how to run it (please provide
>>>>>>>>> exact syntax to put in my startup shell script, because I would
>>>>>>>>> never be able to figure it out for myself and I don't want to have
>>>>>>>>> to bother everyone again.)
>>>>>>>>> 
>>>>>>>>> Thanks so much!
>>>>>>>>> 
>>>>>>>>> Val
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Valerie A. Mallder
>>>>>>>>> 
>>>>>>>>> New Horizons Deputy Mission System Engineer The Johns Hopkins
>>>>>>>>> University/Applied Physics Laboratory
>>>>>>>>> 11100 Johns Hopkins Rd (MS 23-282), Laurel, MD 20723
>>>>>>>>> 240-228-7846 (Office) 410-504-2233 (Blackberry)
>>>>> 
>>>>> ---
>>>>> Rishi Verma
>>>>> NASA Jet Propulsion Laboratory
>>>>> California Institute of Technology
>>> 
>>> ---
>>> Rishi Verma
>>> NASA Jet Propulsion Laboratory
>>> California Institute of Technology
>> 
>> 
>> -- 
>> *Lewis*
