Thanks Luke, I’ve given you permissions so you should now see an “edit” button on that wiki page.
Cheers,
Chris

++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Adjunct Associate Professor, Computer Science Department
University of Southern California
Los Angeles, CA 90089 USA
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++

-----Original Message-----
From: Luke liu <[email protected]>
Date: Wednesday, November 5, 2014 at 6:48 PM
To: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>
Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang' <[email protected]>
Subject: RE: re: Question about OODT file manager

>I just signed up on the wiki (i.e. https://cwiki.apache.org) with the
>following account details:
>  Account name: luke
>  Full Name: Shuai Liu (Luke)
>  Email: [email protected]
>  Password: *******
>
>But I am not sure where I can add my notes to the following web article,
>with which I had trouble. I also tried to create a new article, but
>failed because I cannot find a place where I can edit. Does this have
>something to do with my account, for which the "edit" or "comments"
>actions are not visible?
>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>
>Thanks
>Luke
>-----Original Message-----
>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>Sent: Sunday, November 2, 2014 6:59 AM
>To: Luke liu; [email protected]
>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected]; 'Zichuan Wang'
>Subject: Re: re: Question about OODT file manager
>
>Yes Luke, making the instructions better would be much appreciated!
>
>If you have an account on the wiki please share it, else sign up for an
>Apache OODT wiki account and please share it with me or anyone else on
>dev@oodt and we’ll add you.
>
>Cheers,
>Chris
>
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Chris Mattmann, Ph.D.
>Chief Architect
>Instrument Software and Science Data Systems Section (398)
>NASA Jet Propulsion Laboratory
>Pasadena, CA 91109 USA
>Office: 168-519, Mailstop: 168-527
>Email: [email protected]
>WWW: http://sunset.usc.edu/~mattmann/
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>Adjunct Associate Professor, Computer Science Department
>University of Southern California, Los Angeles, CA 90089 USA
>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>
>-----Original Message-----
>From: Luke liu <[email protected]>
>Date: Sunday, November 2, 2014 at 1:32 AM
>To: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>
>Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang' <[email protected]>
>Subject: RE: re: Question about OODT file manager
>
>>Thanks Professor Mattmann, not running batch_stub was the main culprit,
>>and there were some other issues such as missing jars. Sorry for not
>>confirming this right away; my laptop was actually crashing, and I only
>>just had time to fix it. BTW, I was able to get the cas-pge example to
>>work (even though I saw in the log that the workflow failed to pass the
>>pre-condition, the combined file and some metadata files (i.e. 3 files)
>>were still successfully ingested and placed in the output directory).
>>
>>BTW, I think there are a lot of mistakes in the documents. Do you want
>>us to help correct the document (i.e.
>>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)?
>>If possible, I would like to share my notes on some of the problem
>>steps mentioned there.
>>
>>Anyway, thanks for your help; it is appreciated.
>>
>>Thanks
>>Luke
>>-----Original Message-----
>>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>Sent: Saturday, November 1, 2014 10:48 AM
>>To: Luke; [email protected]
>>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected]; 'Zichuan Wang'
>>Subject: Re: re: Question about OODT file manager
>>
>>Dear Luke, just confirming, we solved this in class right? It had to do
>>with the batch stub not being turned on.
>>
>>Cheers,
>>Chris
>>
>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>Chris Mattmann, Ph.D.
>>Chief Architect
>>Instrument Software and Science Data Systems Section (398)
>>NASA Jet Propulsion Laboratory
>>Pasadena, CA 91109 USA
>>Office: 168-519, Mailstop: 168-527
>>Email: [email protected]
>>WWW: http://sunset.usc.edu/~mattmann/
>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>Adjunct Associate Professor, Computer Science Department
>>University of Southern California, Los Angeles, CA 90089 USA
>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>
>>-----Original Message-----
>>From: Luke <[email protected]>
>>Date: Tuesday, October 28, 2014 at 12:52 PM
>>To: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>
>>Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang' <[email protected]>
>>Subject: RE: re: Question about OODT file manager
>>
>>>Dear Professor Mattmann,
>>>Thanks a lot for the kind help; it is appreciated. Sorry for only now
>>>getting back to you with my appreciation; I have been conducting tests
>>>with OODT based on your advice, but unfortunately I am having another
>>>problem....
>>>
>>>I am following the steps
>>>(https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)
>>>to get a sense of how to get workflow to work.
>>>The problem is that the File-Concatenator-PGE (run via the wmgr-client
>>>command line) does not seem to be invoked or executed. I am seeing the
>>>tasks getting stacked up in the workflow manager with status either
>>>"RSUBMIT" or "QUEUED", but they are not getting executed. PFA:
>>>workflow_monitor.jpg. Please note that by default the workflow min pool
>>>size is 6, so here comes another problem: I have 6 submitted tasks with
>>>status RSUBMIT, but any new incoming tasks are forwarded to the waiting
>>>queue with status "QUEUED". Please refer to workflow_monitor.jpg for
>>>details, where I have 3 QUEUED workflow tasks and 6 RSUBMIT tasks.
>>>
>>>Question 1): I am not sure why the workflow is not being executed and
>>>is hanging in the "RSUBMIT" state. After enabling the log level, I am
>>>seeing the following entry in the log; I am not sure if it has anything
>>>to do with the "hanging" problem, where the workflow is not getting
>>>executed and stays in the "RSUBMIT" state.
>>>    Oct 28, 2014 3:35:07 AM org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread safeCheckJobComplete
>>>    WARNING: Exception checking completion status for job: [2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception: java.lang.NullPointerException
>>>
>>>Question 2): I think that currently, on my side, any new incoming
>>>workflow task I send with the following command is being directed to
>>>the waiting queue because of the min pool size (i.e. 6) (I can increase
>>>this to a larger number though):
>>>    ./wmgr-client --url http://localhost:9200 --operation --sendEvent --eventName fileconcatenator-pge --metaData --key RunID testNumber1
>>>If possible, I would like to know if there is a way we can purge the
>>>queue and get rid of the workflow tasks, in either "RSUBMIT" or
>>>"QUEUED", that I have already sent. Please kindly help.
>>>
>>>Very sorry for troubling you with this. To be honest, I find OODT a bit
>>>challenging to grasp within a short time frame, probably because there
>>>is no book like an "OODT in Action", as there is for Solr. What I am
>>>doing is just trial and error blended with guessing, but I don’t want to
>>>make a blind guess. It would be appreciated if you could also shed some
>>>light on where I can get more logging information, or another way I can
>>>troubleshoot. I think it might be worth tracking what is happening when
>>>a workflow reaches the status "RSUBMIT", and how to get logging info
>>>specific to it...
>>>
>>>Again, your advice and kind help will be appreciated as usual.
>>>
>>>Thanks
>>>Luke
>>>
>>>> -----Original Message-----
>>>> From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>>> Sent: October 26, 2014 22:18
>>>> To: Luke; 'Zichuan Wang'
>>>> Cc: 'Christian Alan Mattmann'; [email protected]; [email protected]; [email protected]
>>>> Subject: Re: re: Question about OODT file manager
>>>>
>>>> Hi Luke,
>>>>
>>>> Thanks and sorry it’s taken me a while to reply. Here are some details
>>>> below:
>>>>
>>>> -----Original Message-----
>>>> From: Luke <[email protected]>
>>>> Date: Sunday, October 26, 2014 at 6:19 PM
>>>> To: Chris Mattmann <[email protected]>, 'Zichuan Wang' <[email protected]>
>>>> Cc: Chris Mattmann <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>, "[email protected]" <[email protected]>
>>>> Subject: RE: re: Question about OODT file manager
>>>>
>>>> >Hi Professor Mattmann and OODT DEV,
>>>> >
>>>> >Sorry to trouble you with this email. Our team has been struggling to
>>>> >get OODT to send JSON files to Solr.
>>>> >One of the difficulties is still getting the OODT workflow to call
>>>> >poster.py in etllib.
>>>>
>>>> Sorry that you’re having difficulty, let me try and help.
>>>>
>>>> >
>>>> >I am not sure if my understanding of the OODT requirement is correct.
>>>> >I hope you can kindly advise and help with our confusion.
>>>> >
>>>> >The set of goals I have in mind with OODT is as follows; please
>>>> >kindly confirm and clarify:
>>>> >
>>>> >1)
>>>> >Get the File-Manager up and running.
>>>>
>>>> Yep, hopefully as installed via OODT RADIX.
>>>>
>>>> >2)
>>>> >Send all JSON files with the wmgr-client command to the File Manager
>>>> >server. (I believe we can achieve this with a bash script, or probably
>>>> >Python, that calls the command line sequentially with each JSON file
>>>> >name as an argument?!)
>>>>
>>>> Suggestion:
>>>>
>>>> 1. Use the OODT crawler and file manager to crawl/index the JSON files
>>>> (in place data transfer).
>>>> 2. Take a look at CAS-PGE, it will help you write a workflow task that
>>>> will wrap ETLlib and the poster command.
>>>> 3. Once you are confident with #2, whip up a script that pages through
>>>> all of your indexed JSON files, and then for each one, submits a
>>>> workflow event (you may need to look into aggregating them) that calls
>>>> your CAS-PGE wrapped poster task from ETLlib.
>>>>
>>>> >3)
>>>> >Once we have the JSON files sent and stored in the File-Manager, we
>>>> >need to get the workflow manager up and running, and we can create a
>>>> >workflow that sends those JSON files from the file manager to Solr.
>>>>
>>>> See above.
>>>>
>>>> >4)
>>>> >Create a workflow according to the Workflow2 User Guide
>>>> ><https://cwiki.apache.org/confluence/display/OODT/Workflow2+User+Guide>
>>>> >>>>>>>>>>> here comes the problem…..
>>>> >I am not sure how to create a workflow task which can call poster.py
>>>> >in the Python etllib. It looks like we need to create our own Java
>>>> >class that extends <TaskInstance>, which is an abstract Java class
>>>> >with one abstract method that has the following signature:
>>>> >
>>>> >protected abstract ResultsState performExecution(ControlMetadata crtlMetadata);
>>>> >
>>>> >However, the details of where to find the corresponding libs and
>>>> >where to put our implementation in the workflow manager are left out
>>>> >of that page. I am not sure if we should use TaskInstance, but it
>>>> >seems the workflow has to have an interface through which it can call
>>>> >the Python code, i.e. poster.py, and it looks like we need to
>>>> >implement TaskInstance::performExecution by injecting the code that
>>>> >calls poster.py and returns the resultState.
>>>> >
>>>> >It would be greatly appreciated if you could shed some light and
>>>> >advise how we can get a task instance to call poster.py. BTW, I am
>>>> >also not sure if my understanding is correct; please kindly correct
>>>> >it if it is not. Your help will be appreciated as usual.
>>>> >
>>>> >Thanks
>>>> >Luke
>>>>
>>>> Thanks Luke, see above. Let me know if it helps.
>>>>
>>>> Cheers!
>>>>
>>>> Chris
>>>>
>>>> >
>>>> >From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>>> >Sent: October 25, 2014 13:34
>>>> >To: Zichuan Wang
>>>> >Cc: Christian Alan Mattmann; Luke; [email protected]; [email protected]
>>>> >Subject: Re: Reply: Question about OODT file manager
>>>> >
>>>> >Please cc [email protected] <mailto:[email protected]> I will reply in
>>>> >detail soon
>>>> >
>>>> >Sent from my iPhone
>>>>
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Chris Mattmann, Ph.D.
>>>> Chief Architect
>>>> Instrument Software and Science Data Systems Section (398)
>>>> NASA Jet Propulsion Laboratory
>>>> Pasadena, CA 91109 USA
>>>> Office: 168-519, Mailstop: 168-527
>>>> Email: [email protected]
>>>> WWW: http://sunset.usc.edu/~mattmann/
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>> Adjunct Associate Professor, Computer Science Department
>>>> University of Southern California, Los Angeles, CA 90089 USA
>>>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>>
>>>> >
>>>> >On Oct 25, 2014, at 1:26 PM, "Zichuan Wang" <[email protected]> wrote:
>>>> >
>>>> >Dear Professor,
>>>> >
>>>> >Could you please also explain how I can crawl all JSON file names
>>>> >under a specific directory using CAS-PGE? I’ll work through this
>>>> >example
>>>> >https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example,
>>>> >but it doesn’t mention anything about crawling; instead it manually
>>>> >sets the input file paths...
>>>> >
>>>> >--
>>>> >Zichuan Wang
>>>> >University of Southern California, Department of Computer Science
>>>> >
>>>> >On Saturday, October 25, 2014, at 12:10 PM, Zichuan Wang wrote:
>>>> >
>>>> >Dear Professor,
>>>> >
>>>> >In assignment 2 specification I noticed that you mentioned OODT File
>>>> >Manager, but from my understanding, we are using ETLLib poster which
>>>> >talks directly to Solr. So how can we use OODT File Manager in this
>>>> >assignment?
>>>> >
>>>> >--
>>>> >Zichuan Wang
>>>> >University of Southern California, Department of Computer Science
