Thanks Professor Mattmann. Not running batch_stub was the main culprit, and
there were some other issues such as missing jars. Sorry for not
confirming this right away; my laptop was crashing and I only just had
time to fix it. BTW, I was able to get the cas-pge example to work: even
though the log showed that the workflow failed to pass the pre-condition, the
combined file and some metadata files (i.e. 3 files) were still successfully
ingested and placed in the output directory.

BTW, I think there are a lot of mistakes in the documentation. Would you like
us to help correct it (e.g.
https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example )?
If possible, I would like to share my notes on some of the problematic steps
described there.

Anyway, thanks for your help; it is much appreciated.

Thanks
Luke
-----Original Message-----
From: Mattmann, Chris A (3980) [mailto:[email protected]] 
Sent: Saturday, November 1, 2014 10:48 AM
To: Luke; [email protected]
Cc: 'Christian Alan Mattmann'; [email protected]; [email protected]; 'Zichuan
Wang'
Subject: Re: re: Question about OODT file manager

Dear Luke, just confirming, we solved this in class right? It had
to do with the batch stub not being turned on.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW:  http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++






-----Original Message-----
From: Luke <[email protected]>
Date: Tuesday, October 28, 2014 at 12:52 PM
To: Chris Mattmann <[email protected]>, "[email protected]"
<[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]"
<[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
<[email protected]>
Subject: RE: re: Question about OODT file manager

>Dear Professor Mattmann,
>Thanks a lot for the kind help; it is appreciated. Sorry for the delay in
>getting back to you. I have been conducting tests with OODT based on your
>advice, but unfortunately I have run into another problem.
>
>I am following the steps
>(https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)
>to get a sense of how to get a workflow to work.
>The problem is that the File-Concatenator-PGE (started by running the
>wmgr-client command line) does not seem to be invoked or executed. Instead,
>the tasks are stacking up in the workflow manager with status either
>"RSUBMIT" or "QUEUED" and are never executed; please find attached
>workflow_monitor.jpg. Please note that by default the workflow min pool
>size is 6, which leads to another problem: I have 6 submitted tasks with
>status "RSUBMIT", and any new incoming task is forwarded to the waiting
>queue with status "QUEUED". Please refer to workflow_monitor.jpg for
>details, where I have 3 QUEUED workflow tasks and 6 RSUBMIT tasks.
>
>Question 1): I am not sure why the workflow is not being executed and is
>hanging in the "RSUBMIT" state. After adjusting the log level, I see the
>following entry in the log, but I am not sure whether it has anything to do
>with the workflow hanging in the "RSUBMIT" state.
>       Oct 28, 2014 3:35:07 AM
>org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread
>safeCheckJobComplete
>       WARNING: Exception checking completion status for job:
>[2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception:
>java.lang.NullPointerException
>
>Question 2): I think any new workflow task I send with the following
>command is currently being directed to the waiting queue because of the
>min pool size (i.e. 6), although I can increase this to a larger number:
>       ./wmgr-client --url http://localhost:9200 --operation --sendEvent \
>           --eventName fileconcatenator-pge --metaData --key RunID testNumber1
>       If possible, I would like to know whether there is a way to purge
>the queue and get rid of the workflow tasks I have already sent that are
>in either the "RSUBMIT" or "QUEUED" state. Please kindly help.
>
>Very sorry for troubling you with this. To be honest, I find OODT a bit
>challenging to grasp within a short time frame, probably because there is
>no "OODT in Action" book as there is for Solr. What I am doing is just
>trial and error blended with guesswork, but I don't want to guess
>blindly. It would be appreciated if you could also shed some light on
>where I can get more logging information, or on other ways to
>troubleshoot. I think it might be worth tracking what happens when a
>workflow reaches the "RSUBMIT" status and how to get logging
>specific to it.
>
>Again, your advice and kind help will be appreciated as usual.
>
>
>Thanks
>Luke
>
>> -----Original Message-----
>> From: Mattmann, Chris A (3980) [mailto:[email protected]]
>> Sent: October 26, 2014 22:18
>> To: Luke; 'Zichuan Wang'
>> Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>> [email protected]
>> Subject: Re: re: Question about OODT file manager
>> 
>> Hi Luke,
>> 
>> Thanks and sorry it’s taken me a while to reply. Here are some details
>> below:
>> 
>> 
>> -----Original Message-----
>> From: Luke <[email protected]>
>> Date: Sunday, October 26, 2014 at 6:19 PM
>> To: Chris Mattmann <[email protected]>, 'Zichuan Wang'
>> <[email protected]>
>> Cc: Chris Mattmann <[email protected]>, "[email protected]"
>> <[email protected]>, "[email protected]" <[email protected]>,
>> "[email protected]" <[email protected]>
>> Subject: RE: re: Question about OODT file manager
>> 
>> >Hi Professor Mattmann and OODT DEV,
>> >
>> >Sorry to trouble you with this email. Our team has been struggling to
>> >use OODT to send JSON files to Solr.
>> >One of the difficulties is still getting an OODT workflow to call
>> >poster.py in etllib.
>> 
>> Sorry that you’re having difficulty; let me try and help.
>> 
>> >
>> >I am not sure whether my understanding of the OODT requirement is
>> >correct; I hope you can kindly advise and help clear up our confusion.
>> >
>> >The goals I have in mind for OODT are as follows; please kindly
>> >confirm and clarify:
>> >
>> >1)
>> >Get the File-Manager up and running.
>> 
>> Yep, hopefully as installed via OODT RADIX.
>> 
>> >2)
>> >Send all JSON files to the File Manager server with the wmgr-client
>> >command. (I believe we can achieve this with a bash script, or perhaps
>> >a Python script, that calls the command line sequentially with each
>> >JSON file name as an argument?)
>> 
>> Suggestion:
>> 
>> 1. Use the OODT crawler and file manager to crawl/index the JSON files
>>    (in place data transfer).
>> 2. Take a look at CAS-PGE, it will help you write a workflow task that
>>    will wrap ETLlib and the poster command.
>> 3. Once you are confident with #2, whip up a script that pages through
>>    all of your indexed JSON files, and then for each one, submits a
>>    workflow event (you may need to look into aggregating them) that
>>    calls your CAS-PGE wrapped poster task from ETLlib.
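
A rough sketch of the script described in step 3, under several assumptions. It only builds one wmgr-client command per JSON file and prints it (a dry run). The flag layout is copied from the wmgr-client command quoted elsewhere in this thread and has not been checked against the OODT documentation. Globbing a local directory stands in for paging through the file manager catalog, and the metadata key "InputFile" and the reuse of the "fileconcatenator-pge" event name are hypothetical placeholders.

```python
import shlex
from pathlib import Path


def build_send_event_command(url, event_name, key, value, client="./wmgr-client"):
    """Return the argument list for one wmgr-client sendEvent call.

    Assumption: the flag order mirrors the command quoted in this thread.
    """
    return [
        client,
        "--url", url,
        "--operation", "--sendEvent",
        "--eventName", event_name,
        "--metaData", "--key", key, value,
    ]


def commands_for_directory(directory, url, event_name, key="InputFile"):
    """Build one command per *.json file found directly under `directory`.

    "InputFile" is a hypothetical metadata key, not an OODT-defined name.
    """
    json_files = sorted(Path(directory).glob("*.json"))
    return [
        build_send_event_command(url, event_name, key, str(path))
        for path in json_files
    ]


if __name__ == "__main__":
    # Dry run: print each command instead of executing it.
    for cmd in commands_for_directory(
        ".", "http://localhost:9200", "fileconcatenator-pge"
    ):
        print(shlex.join(cmd))
```

Once the printed commands look right, each argument list could be handed to subprocess.run to actually submit the events. Keeping the dry run as the default avoids flooding the workflow queue while experimenting.
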
>> 
>> >3)
>> >Once we have the JSON files sent to and stored in the File Manager, we
>> >need to get the workflow manager up and running, and we can create a
>> >workflow that sends those JSON files from the File Manager to Solr.
>> 
>> See above.
>> 
>> >4)
>> >Create a workflow according to
>> >Workflow2 User Guide
>> ><https://cwiki.apache.org/confluence/display/OODT/Workflow2+User+Guide>
>> >Here is where the problem comes in.
>> >I am not sure how to create a workflow task that can call poster.py in
>> >the Python etllib. It looks like we need to create our own Java class
>> >that extends TaskInstance, an abstract Java class with one abstract
>> >method that has the following signature:
>> >
>> >
>> >protected abstract ResultsState performExecution(ControlMetadata
>> >crtlMetadata);
>> >However, that page does not say where to find the corresponding
>> >libraries or where to put our implementation in the workflow manager.
>> >I am not sure whether we should use TaskInstance, but it seems the
>> >workflow needs an interface through which it can call Python code such
>> >as poster.py, and it looks like we need to implement
>> >TaskInstance::performExecution with code that calls poster.py and
>> >returns the ResultsState.
>> >
>> >
>> >It would be greatly appreciated if you could shed some light on this
>> >and advise how we can get a task instance to call poster.py. BTW, I
>> >am also not sure whether my understanding is correct; please kindly
>> >correct it where it is wrong. Your help will be appreciated as usual.
>> >
>> >
>> >
>> >Thanks
>> >Luke
>> 
>> Thanks Luke, see above. Let me know if it helps.
>> 
>> Cheers!
>> 
>> Chris
>> 
>> >
>> >From: Mattmann, Chris A (3980) [mailto:[email protected]]
>> >
>> >Sent: October 25, 2014 13:34
>> >To: Zichuan Wang
>> >Cc: Christian Alan Mattmann; Luke; [email protected]; [email protected]
>> >Subject: Re: re: Question about OODT file manager
>> >
>> >
>> >
>> >Please cc [email protected]. I will reply in detail soon.
>> >
>> >Sent from my iPhone
>> 
>> 
>> >
>> >
>> >On Oct 25, 2014, at 1:26 PM, "Zichuan Wang" <[email protected]> wrote:
>> >
>> >
>> >Dear Professor,
>> >
>> >
>> >
>> >Could you please also explain how I can crawl all JSON file names under
>> >a specific directory using CAS-PGE? I’ll work through this example,
>> >https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example,
>> >but it doesn’t mention anything about crawling; instead it
>> >manually sets the input file paths...
>> >
>> >
>> >
>> >
>> >--
>> >
>> >Zichuan Wang
>> >
>> >University of Southern California, Department of Computer Science
>> >
>> >
>> >
>> >
>> >On Saturday, October 25, 2014, at 12:10 PM, Zichuan Wang wrote:
>> >
>> >Dear Professor,
>> >
>> >
>> >
>> >In the assignment 2 specification I noticed that you mentioned the OODT
>> >File Manager, but as I understand it, we are using the ETLLib poster,
>> >which talks directly to Solr. So how can we use the OODT File Manager in
>> >this assignment?

