Thanks Luke, I’ve given you permissions so you should now see an
“edit” button on that wiki page.

Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Adjunct Associate Professor, Computer Science Department
University of Southern California
Los Angeles, CA 90089 USA
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++




-----Original Message-----
From: Luke liu <[email protected]>
Date: Wednesday, November 5, 2014 at 6:48 PM
To: Chris Mattmann <[email protected]>, "[email protected]"
<[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]"
<[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
<[email protected]>
Subject: RE: re: Question about OODT file manager

>I just signed up on the wiki (i.e. https://cwiki.apache.org) with the
>following account details:
>       Account name: luke
>       Full Name: Shuai Liu (Luke)
>       Email: [email protected]
>       Password: *******
>
>But I am not sure where I can add my notes to the following web article,
>which I had trouble with. I also tried to create a new article but could
>not, because I cannot find anywhere to edit. Could this be because my
>account does not yet have access to the "edit" or "comments" actions?
>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>
>
>Thanks
>Luke
>-----Original Message-----
>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>Sent: Sunday, November 2, 2014 6:59 AM
>To: Luke liu; [email protected]
>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>'Zichuan
>Wang'
>Subject: Re: re: Question about OODT file manager
>
>Yes Luke, making the instructions better would be much appreciated!
>
>If you have an account on the wiki please share it, else sign up for an
>Apache OODT wiki account and please share it with me or anyone else on
>dev@oodt and we’ll add you.
>
>Cheers,
>Chris
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398) NASA Jet
>Propulsion Laboratory Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: [email protected]
>WWW:  http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department University of
>Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>
>
>
>
>
>-----Original Message-----
>From: Luke liu <[email protected]>
>Date: Sunday, November 2, 2014 at 1:32 AM
>To: Chris Mattmann <[email protected]>, "[email protected]"
><[email protected]>
>Cc: Chris Mattmann <[email protected]>, "[email protected]"
><[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
><[email protected]>
>Subject: RE: re: Question about OODT file manager
>
>>Thanks Professor Mattmann. Not running batch_stub was the main culprit,
>>and there were some other issues such as missing jars. Sorry for not
>>confirming this right away; my laptop was crashing and I only just had
>>time to fix it. BTW, I was able to get the cas-pge example to work: even
>>though the log showed the workflow failing its pre-condition, the
>>combined file and some metadata files (i.e. 3 files) were still
>>successfully ingested and placed in the output directory.
>>
>>BTW, I think there are a lot of mistakes in the documents. Do you want
>>us to help correct the document (i.e.
>>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)?
>>If possible, I would like to share my notes on some of the problem
>>steps mentioned there.
>>
>>Anyway, thanks for your help; it is appreciated.
>>
>>Thanks
>>Luke
>>-----Original Message-----
>>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>Sent: Saturday, November 1, 2014 10:48 AM
>>To: Luke; [email protected]
>>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>>'Zichuan Wang'
>>Subject: Re: re: Question about OODT file manager
>>
>>Dear Luke, just confirming, we solved this in class right? It had to do
>>with the batch stub not being turned on.
>>
>>Cheers,
>>Chris
>>
>>
>>
>>
>>
>>
>>
>>-----Original Message-----
>>From: Luke <[email protected]>
>>Date: Tuesday, October 28, 2014 at 12:52 PM
>>To: Chris Mattmann <[email protected]>, "[email protected]"
>><[email protected]>
>>Cc: Chris Mattmann <[email protected]>, "[email protected]"
>><[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
>><[email protected]>
>>Subject: RE: re: Question about OODT file manager
>>
>>>Dear Professor Mattmann,
>>>Thank you for the kind help; it is appreciated. Sorry for the delay in
>>>getting back to you. I have been conducting tests with OODT based on
>>>your advice, but unfortunately I am having another problem.
>>>
>>>I am following the steps
>>>(https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)
>>>to get a sense of how to get a workflow to work.
>>>The problem is that the File-Concatenator-PGE (started via the
>>>wmgr-client command line) does not seem to be invoked or executed.
>>>Instead, tasks are stacking up in the workflow manager with status
>>>either "RSUBMIT" or "QUEUED" and are never executed; PFA
>>>workflow_monitor.jpg. Please note that by default the workflow min pool
>>>size is 6, which leads to another problem: I have 6 submitted tasks
>>>with status RSUBMIT, and any new incoming task is forwarded to the
>>>waiting queue with status "QUEUED". Please refer to
>>>workflow_monitor.jpg for details, where I have 3 QUEUED workflow tasks
>>>and 6 RSUBMIT tasks.
>>>
>>>Question 1): I am not sure why the workflow is not being executed and
>>>hangs in the "RSUBMIT" state. After adjusting the log level, I see the
>>>following entry in the log; I am not sure whether it is related to the
>>>hang:
>>>     Oct 28, 2014 3:35:07 AM
>>>org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread
>>>safeCheckJobComplete
>>>     WARNING: Exception checking completion status for job:
>>>[2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception:
>>>java.lang.NullPointerException
>>>
>>>Question 2): I think that any new workflow task I send with the
>>>following command is currently being directed to the waiting queue
>>>because of the min pool size (i.e. 6), although I can increase this to
>>>a larger number:
>>>     ./wmgr-client --url http://localhost:9200 --operation --sendEvent \
>>>         --eventName fileconcatenator-pge --metaData --key RunID testNumber1
>>>If possible, I would like to know whether there is a way to purge the
>>>queue and get rid of the workflow tasks in "RSUBMIT" and "QUEUED" that
>>>I have already sent. Please kindly help.
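
The wmgr-client call in Question 2 can also be assembled from a script, which helps when the same event has to be sent with different metadata. The following is a minimal Python sketch, not an official OODT helper: the function name `build_send_event_command` is hypothetical, it uses only the flags that appear in the command quoted in this message, and repeating `--key` for more than one metadata entry is an assumption that the thread does not confirm.

```python
import shlex
from typing import Dict, List


def build_send_event_command(url: str, event_name: str,
                             metadata: Dict[str, str],
                             client: str = "./wmgr-client") -> List[str]:
    """Build the argv list for a wmgr-client sendEvent call.

    Only the flags that appear in the command quoted in this thread are used.
    """
    argv = [client, "--url", url, "--operation", "--sendEvent",
            "--eventName", event_name, "--metaData"]
    # One "--key NAME VALUE" group per metadata entry. Repeating --key for
    # several entries is an assumption; the thread only shows a single key.
    for key, value in metadata.items():
        argv += ["--key", key, value]
    return argv


if __name__ == "__main__":
    cmd = build_send_event_command(
        "http://localhost:9200", "fileconcatenator-pge",
        {"RunID": "testNumber1"})
    # Print the command rather than running it; hand cmd to subprocess.run
    # once the workflow manager is known to be up.
    print(shlex.join(cmd))
```

Keeping the command as a list of arguments avoids shell-quoting problems if a metadata value ever contains spaces.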
>>>
>>>Very sorry for troubling you with this. To be honest, I find OODT a bit
>>>challenging to grasp within a short time frame, probably because there
>>>is no "OODT in Action" book as there is for Solr, and what I am doing
>>>is just trial and error blended with guesswork. I don’t want to guess
>>>blindly, so I would appreciate it if you could also shed some light on
>>>where I can get more logging information or other ways to
>>>troubleshoot. I think it might be worth tracking what happens when a
>>>workflow reaches the "RSUBMIT" status and how to get logging
>>>specific to it.
>>>
>>>Again, your advice and kind help will be appreciated as usual.
>>>
>>>
>>>Thanks
>>>Luke
>>>
>>>> -----Original Message-----
>>>> From: Mattmann, Chris A (3980)
>>>> [mailto:[email protected]]
>>>> Sent: October 26, 2014 22:18
>>>> To: Luke; 'Zichuan Wang'
>>>> Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>>>> [email protected]
>>>> Subject: Re: re: Question about OODT file manager
>>>> 
>>>> Hi Luke,
>>>> 
>>>> Thanks and sorry it’s taken me a while to reply. Here are some
>>>> details below:
>>>> 
>>>> 
>>>> -----Original Message-----
>>>> From: Luke <[email protected]>
>>>> Date: Sunday, October 26, 2014 at 6:19 PM
>>>> To: Chris Mattmann <[email protected]>, 'Zichuan Wang'
>>>> <[email protected]>
>>>> Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>>> <[email protected]>, "[email protected]" <[email protected]>,
>>>> "[email protected]" <[email protected]>
>>>> Subject: RE: re: Question about OODT file manager
>>>> 
>>>> >Hi Professor Mattmann and OODT DEV,
>>>> >
>>>> >Sorry to trouble you with this email. Our team has been struggling
>>>> >to send JSON files to Solr through OODT.
>>>> >One of the difficulties is still getting an OODT workflow to call
>>>> >poster.py in etllib.
>>>> 
>>>> Sorry that you’re having difficulty; let me try and help.
>>>> 
>>>> >
>>>> >I am not sure whether my understanding of the OODT requirement is
>>>> >correct, so I hope you can kindly advise and help with our confusion.
>>>> >
>>>> >The goals I have in mind with OODT are as follows; please kindly
>>>> >confirm and clarify:
>>>> >
>>>> >1)
>>>> >Get the File-Manager up and running.
>>>> 
>>>> Yep, hopefully as installed via OODT RADIX.
>>>> 
>>>> >2)
>>>> >Send all JSON files to the File Manager server with the wmgr-client
>>>> >command. (I believe we can achieve this with a bash or Python script
>>>> >that calls the command line sequentially with each JSON file name as
>>>> >an argument?)
>>>> 
>>>> Suggestion:
>>>> 
>>>> 1. Use the OODT crawler and file manager to crawl/index the JSON
>>>>    files (in-place data transfer).
>>>> 2. Take a look at CAS-PGE; it will help you write a workflow task
>>>>    that wraps ETLlib and the poster command.
>>>> 3. Once you are confident with #2, whip up a script that pages
>>>>    through all of your indexed JSON files and then, for each one,
>>>>    submits a workflow event (you may need to look into aggregating
>>>>    them) that calls your CAS-PGE-wrapped poster task from ETLlib.
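
The paging and aggregation part of step 3 could look roughly like the Python sketch below. It makes two loud assumptions: it walks a directory for `*.json` files instead of querying the File Manager catalog (a stand-in, not the OODT API), and the actual event submission is left to a caller-supplied `submit` function. The names `json_batches` and `submit_all` are hypothetical.

```python
from pathlib import Path
from typing import Callable, Iterator, List


def json_batches(root: Path, batch_size: int) -> Iterator[List[Path]]:
    """Yield sorted *.json paths under root in lists of at most batch_size."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    batch: List[Path] = []
    # Assumption: a directory walk stands in for paging through the
    # File Manager's indexed products.
    for path in sorted(root.rglob("*.json")):
        batch.append(path)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        # Emit the final, possibly shorter, batch.
        yield batch


def submit_all(root: Path, batch_size: int,
               submit: Callable[[List[Path]], None]) -> int:
    """Call submit once per batch and return the number of batches sent."""
    count = 0
    for batch in json_batches(root, batch_size):
        submit(batch)
        count += 1
    return count
```

With `batch_size=1` this submits one event per file; a larger value is one way to aggregate files so that fewer workflow events are sent.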
>>>> 
>>>> >3)
>>>> >Once the JSON files have been sent to and stored in the File
>>>> >Manager, we need to get the workflow manager up and running and
>>>> >create a workflow that sends those JSON files from the File Manager
>>>> >to Solr.
>>>> 
>>>> See above.
>>>> 
>>>> >4)
>>>> >Create a workflow according to the Workflow2 User Guide
>>>> ><https://cwiki.apache.org/confluence/display/OODT/Workflow2+User+Guide>.
>>>> >Here comes the problem:
>>>> >         I am not sure how to create a workflow task that can call
>>>> >poster.py in the Python etllib. It looks like we need to create our
>>>> >own Java class that extends <TaskInstance>, an abstract Java class
>>>> >with one abstract method that has the following signature:
>>>> >
>>>> >
>>>> >protected abstract ResultsState performExecution(ControlMetadata
>>>> >crtlMetadata);
>>>> >         However, that page does not say where to find the
>>>> >corresponding libs or where to put our implementation in the
>>>> >workflow manager. I am not sure whether we should use TaskInstance,
>>>> >but it seems the workflow needs an interface through which it can
>>>> >call the Python code, i.e. poster.py, and it looks like we need to
>>>> >implement TaskInstance::performExecution with code that calls
>>>> >poster.py and returns the ResultsState.
>>>> >
>>>> >
>>>> >It would be greatly appreciated if you could shed some light and
>>>> >advise how we can get a task instance to call poster.py. BTW, I am
>>>> >also not sure whether my understanding is correct; please kindly
>>>> >correct it where it is wrong. Your help will be appreciated as usual.
>>>> >
>>>> >
>>>> >
>>>> >Thanks
>>>> >Luke
>>>> 
>>>> Thanks Luke, see above. Let me know if it helps.
>>>> 
>>>> Cheers!
>>>> 
>>>> Chris
>>>> 
>>>> >
>>>> >From: Mattmann, Chris A (3980)
>>>> >[mailto:[email protected]]
>>>> >
>>>> >Sent: October 25, 2014 13:34
>>>> >To: Zichuan Wang
>>>> >Cc: Christian Alan Mattmann; Luke; [email protected];
>>>> >[email protected]
>>>> >Subject: Re: Re: Question about OODT file manager
>>>> >
>>>> >
>>>> >
>>>> >Please cc
>>>> >[email protected] <mailto:[email protected]>. I will reply in detail
>>>> >soon.
>>>> >
>>>> >Sent from my iPhone
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> 
>>>> >
>>>> >
>>>> >On Oct 25, 2014, at 1:26 PM, "Zichuan Wang" <[email protected]> wrote:
>>>> >
>>>> >
>>>> >Dear Professor,
>>>> >
>>>> >
>>>> >
>>>> >Could you please also explain how I can crawl all JSON file names
>>>> >under a specific directory using CAS-PGE? I’ll work through this
>>>> >example,
>>>> >https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example,
>>>> >but it doesn’t mention anything about crawling; instead it manually
>>>> >sets the input file paths...
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >--
>>>> >
>>>> >Zichuan Wang
>>>> >
>>>> >University of Southern California, Department of Computer Science
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >On Saturday, October 25, 2014, at 12:10 PM, Zichuan Wang wrote:
>>>> >
>>>> >Dear Professor,
>>>> >
>>>> >
>>>> >
>>>> >In the assignment 2 specification I noticed that you mentioned the
>>>> >OODT File Manager, but from my understanding we are using the ETLlib
>>>> >poster, which talks directly to Solr. So how can we use the OODT
>>>> >File Manager in this assignment?
>>>> >
>>>> >
>>>> >
>>>> >--
>>>> >
>>>> >Zichuan Wang
>>>> >
>>>> >University of Southern California, Department of Computer Science
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>> >
>>>
>>
>>
>
>
