woot thanks

------------------------
Chris Mattmann
[email protected]
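For anyone scripting the fix reported in the quoted message below (exporting JAVA_OPTS=-Xmx2048m before running crawler_launcher), here is a minimal Python sketch. The helper names build_env and build_crawler_argv are hypothetical; the flags and values are copied from the command quoted in this thread. It assumes, as the thread reports, that the launcher script picks up JAVA_OPTS from the environment. The sketch only assembles the environment and command line; it does not launch anything.

```python
import os
import shlex


def build_env(heap="2048m", base_env=None):
    """Return a copy of the environment with JAVA_OPTS set to a max-heap flag."""
    env = dict(os.environ if base_env is None else base_env)
    env["JAVA_OPTS"] = f"-Xmx{heap}"  # same value as the export in the thread
    return env


def build_crawler_argv(product_path,
                       filemgr_url="http://localhost:9000",
                       workflow_url="http://localhost:9200"):
    """Assemble the crawler_launcher command line quoted in the thread."""
    return [
        "./crawler_launcher",
        "--operation", "--launchAutoCrawler",
        "--filemgrUrl", filemgr_url,
        "--clientTransferer",
        "org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory",
        "--productPath", product_path,
        "--mimeExtractorRepo", "../policy/mime-extractor-map.xml",
        "--workflowMgrUrl", workflow_url,
        "-ais", "TriggerPostIngestWorkflow",
    ]


# Dry run: show what would be executed, without starting the crawler.
argv = build_crawler_argv("/Users/zichuanwang/Downloads/output")
print("JAVA_OPTS=" + build_env()["JAVA_OPTS"], shlex.join(argv))
```

To actually launch it, one could pass both values to subprocess.run(argv, env=build_env(), check=True) from the crawler's bin directory. Note that a larger heap only postpones the problem if the crawled directory keeps growing, since the stack trace below shows the whole directory listing being loaded at once.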
-----Original Message-----
From: Zichuan Wang <[email protected]>
Reply-To: <[email protected]>
Date: Wednesday, November 5, 2014 at 11:22 PM
To: Chris Mattmann <[email protected]>
Cc: Chris Mattmann <[email protected]>, <[email protected]>, Luke liu <[email protected]>, <[email protected]>, <[email protected]>
Subject: Re: re: Question about OODT file manager

>Googled around and found this little trick:
>
>export JAVA_OPTS=-Xmx2048m
>
>It works now, thanks professor!
>
>--
>Zichuan Wang
>Department of Computer Science, USC
>
>On Wed, Nov 5, 2014 at 10:40 PM, Mattmann, Chris A (3980)
><[email protected]> wrote:
>
>> Got it. Can you increase the heap space on your batch stub? That
>> should take care of it.
>> Cheers,
>> Chris
>> P.S. Great work!
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Chris Mattmann, Ph.D.
>> Chief Architect
>> Instrument Software and Science Data Systems Section (398)
>> NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
>> Office: 168-519, Mailstop: 168-527
>> Email: [email protected]
>> WWW: http://sunset.usc.edu/~mattmann/
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> Adjunct Associate Professor, Computer Science Department
>> University of Southern California, Los Angeles, CA 90089 USA
>> ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>> -----Original Message-----
>> From: Zichuan Wang <[email protected]>
>> Date: Wednesday, November 5, 2014 at 11:12 PM
>> To: Chris Mattmann <[email protected]>
>> Cc: Chris Mattmann <[email protected]>, "[email protected]"
>> <[email protected]>, Luke liu <[email protected]>, "[email protected]"
>> <[email protected]>, "[email protected]" <[email protected]>
>> Subject: Re: re: Question about OODT file manager
>>>Dear Professor,
>>>
>>>I finally figured out how to trigger a post ingest event. However, when I
>>>try to crawl the whole dataset, I get an OutOfMemoryError.
>>>Could you please take a look and maybe give some suggestions?
>>>
>>>➜ bin ./crawler_launcher \
>>>--operation --launchAutoCrawler \
>>>--filemgrUrl http://localhost:9000 \
>>>--clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
>>>--productPath /Users/zichuanwang/Downloads/output \
>>>--mimeExtractorRepo ../policy/mime-extractor-map.xml \
>>>--workflowMgrUrl http://localhost:9200 \
>>>-ais TriggerPostIngestWorkflow
>>>Setting property 'AutoDetectProductCrawler.mimeExtractorRepo'
>>>Setting property 'StdProductCrawler.clientTransferer'
>>>Setting property 'MetExtractorProductCrawler.clientTransferer'
>>>Setting property 'AutoDetectProductCrawler.clientTransferer'
>>>Setting property 'StdProductCrawler.filemgrUrl'
>>>Setting property 'MetExtractorProductCrawler.filemgrUrl'
>>>Setting property 'AutoDetectProductCrawler.filemgrUrl'
>>>Setting property 'TriggerPostIngestWorkflow.workflowMgrUrl'
>>>Setting property 'StdProductCrawler.actionIds'
>>>Setting property 'MetExtractorProductCrawler.actionIds'
>>>Setting property 'AutoDetectProductCrawler.actionIds'
>>>Setting property 'StdProductCrawler.productPath'
>>>Setting property 'MetExtractorProductCrawler.productPath'
>>>Setting property 'AutoDetectProductCrawler.productPath'
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'StdProductCrawler.productPath' set to value [/Users/zichuanwang/Downloads/output]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'TriggerPostIngestWorkflow.workflowMgrUrl' set to value [http://localhost:9200]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'AutoDetectProductCrawler.mimeExtractorRepo' set to value [../policy/mime-extractor-map.xml]
>>>Nov 5, 2014 10:07:47 PM
>>>org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'MetExtractorProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'AutoDetectProductCrawler.filemgrUrl' set to value [http://localhost:9000]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'AutoDetectProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'MetExtractorProductCrawler.actionIds' set to value [TriggerPostIngestWorkflow]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'StdProductCrawler.actionIds' set to value [TriggerPostIngestWorkflow]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'StdProductCrawler.filemgrUrl' set to value [http://localhost:9000]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'AutoDetectProductCrawler.actionIds' set to value [TriggerPostIngestWorkflow]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'AutoDetectProductCrawler.productPath' set to value [/Users/zichuanwang/Downloads/output]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'MetExtractorProductCrawler.filemgrUrl' set to value [http://localhost:9000]
>>>Nov 5, 2014 10:07:47 PM
>>>org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'StdProductCrawler.clientTransferer' set to value [org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory]
>>>Nov 5, 2014 10:07:47 PM org.springframework.beans.factory.config.PropertyOverrideConfigurer processKey
>>>: Property 'MetExtractorProductCrawler.productPath' set to value [/Users/zichuanwang/Downloads/output]
>>>Nov 5, 2014 10:07:47 PM org.apache.oodt.cas.crawl.ProductCrawler crawl
>>>INFO: Crawling /Users/zichuanwang/Downloads/output
>>>Exception in thread "main" java.lang.OutOfMemoryError: Java heap space
>>>at java.io.UnixFileSystem.list(Native Method)
>>>at java.io.File.list(File.java:973)
>>>at java.io.File.listFiles(File.java:1129)
>>>at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:104)
>>>at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:75)
>>>at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:58)
>>>at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
>>>at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:187)
>>>at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:36)
>>>
>>>--
>>>Zichuan Wang
>>>Department of Computer Science, USC
>>>
>>>On Wed, Nov 5, 2014 at 6:42 PM, Christian Alan Mattmann
>>><[email protected]> wrote:
>>>
>>>Thanks Luke, I’ve given you permissions so you should now see an
>>>“edit” button on that wiki page.
>>>
>>>Cheers,
>>>Chris
>>>
>>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>Chris Mattmann, Ph.D.
>>>Adjunct Associate Professor, Computer Science Department
>>>University of Southern California
>>>Los Angeles, CA 90089 USA
>>>Email: [email protected]
>>>WWW: http://sunset.usc.edu/~mattmann/
>>>++++++++++++++++++++++++++++++++++++++++++++++++++++++++
>>>
>>>-----Original Message-----
>>>From: Luke liu <[email protected]>
>>>Date: Wednesday, November 5, 2014 at 6:48 PM
>>>To: Chris Mattmann <[email protected]>, "[email protected]"
>>><[email protected]>
>>>Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>><[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
>>><[email protected]>
>>>Subject: RE: re: Question about OODT file manager
>>>
>>>>I just signed up on the wiki (i.e. https://cwiki.apache.org ) with the
>>>>following account details:
>>>> Account name: luke
>>>> Full Name: Shuai Liu (Luke)
>>>> Email: [email protected]
>>>> Password: *******
>>>>
>>>>But I am not sure where I can add my notes to the following web article,
>>>>with which I had trouble. I also tried to create a new article, but failed
>>>>to do it as I cannot find a place where I can edit. Does this have something
>>>>to do with my account, for which the "edit" or "comments" action is not
>>>>visible?
>>>>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>>>>
>>>>Thanks
>>>>Luke
>>>>-----Original Message-----
>>>>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>>>Sent: Sunday, November 2, 2014 6:59 AM
>>>>To: Luke liu; [email protected]
>>>>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>>>>'Zichuan Wang'
>>>>Subject: Re: re: Question about OODT file manager
>>>>
>>>>Yes Luke, making the instructions better would be much appreciated!
>>>>
>>>>If you have an account on the wiki please share it, else sign up for an
>>>>Apache OODT wiki account and please share it with me or anyone else on
>>>>dev@oodt and we’ll add you.
>>>>
>>>>Cheers,
>>>>Chris
>>>>
>>>>-----Original Message-----
>>>>From: Luke liu <[email protected]>
>>>>Date: Sunday, November 2, 2014 at 1:32 AM
>>>>To: Chris Mattmann <[email protected]>, "[email protected]"
>>>><[email protected]>
>>>>Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>>><[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
>>>><[email protected]>
>>>>Subject: RE: re: Question about OODT file manager
>>>>
>>>>>Thanks Professor Mattmann, not running batch_stub was the main culprit,
>>>>>and there were some other issues such as missing jars. Sorry for
>>>>>not confirming this right away; my laptop was actually crashing, and I
>>>>>just had time to fix it. BTW, I was able to get the cas-pge example to
>>>>>work (even though I saw in the log that the workflow failed to pass the
>>>>>pre-condition, the combined file and some metadata files (i.e. 3 files)
>>>>>were still successfully ingested and placed in the output directory).
>>>>>
>>>>>BTW, I think there are a lot of mistakes in the documents. Do you want
>>>>>us to help correct the document (i.e.
>>>>>https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>>>>>)?
>>>>>If possible, I would like to share my notes on some of the problem
>>>>>steps mentioned there.
>>>>>
>>>>>Anyway, thanks for your help; it is appreciated.
>>>>>
>>>>>Thanks
>>>>>Luke
>>>>>-----Original Message-----
>>>>>From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>>>>Sent: Saturday, November 1, 2014 10:48 AM
>>>>>To: Luke; [email protected]
>>>>>Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>>>>>'Zichuan Wang'
>>>>>Subject: Re: re: Question about OODT file manager
>>>>>
>>>>>Dear Luke, just confirming, we solved this in class right? It had to
>>>>>do with the batch stub not being turned on.
>>>>>
>>>>>Cheers,
>>>>>Chris
>>>>>
>>>>>-----Original Message-----
>>>>>From: Luke <[email protected]>
>>>>>Date: Tuesday, October 28, 2014 at 12:52 PM
>>>>>To: Chris Mattmann <[email protected]>, "[email protected]"
>>>>><[email protected]>
>>>>>Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>>>><[email protected]>, "[email protected]" <[email protected]>, 'Zichuan Wang'
>>>>><[email protected]>
>>>>>Subject: RE: re: Question about OODT file manager
>>>>>
>>>>>>Dear Professor Mattmann,
>>>>>>Thanks a lot for the kind help, it is appreciated.
>>>>>>Sorry for getting back to you with my appreciation only now; I have been
>>>>>>conducting tests with OODT based on your advice, but unfortunately I
>>>>>>am having another problem....
>>>>>>
>>>>>>I am following the steps
>>>>>>(https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example)
>>>>>>to get a sense of how to get a workflow to work.
>>>>>>The problem is that the File-Concatenator-PGE (run from the wmgr-client
>>>>>>command line) does not seem to be invoked or executed. I am
>>>>>>seeing the tasks stack up in the workflow manager with
>>>>>>status either "RSUBMIT" or "QUEUED", but they are not getting
>>>>>>executed (please find attached workflow_monitor.jpg). Please note that
>>>>>>by default the workflow min pool size is 6, so here comes another
>>>>>>problem: I have 6 submitted tasks with status RSUBMIT, and any new
>>>>>>incoming tasks are forwarded to the waiting queue with status "QUEUED".
>>>>>>Please refer to workflow_monitor.jpg for details, where I have 3 QUEUED
>>>>>>workflow tasks and 6 RSUBMIT tasks.
>>>>>>
>>>>>>Question 1): I am not sure why the workflow is not being executed and is
>>>>>>hanging in the "RSUBMIT" state. After enabling the log level, I am
>>>>>>seeing the following entry in the log; I am not sure if this has anything
>>>>>>to do with the "hanging" problem.
>>>>>> Oct 28, 2014 3:35:07 AM
>>>>>>org.apache.oodt.cas.workflow.engine.IterativeWorkflowProcessorThread
>>>>>>safeCheckJobComplete
>>>>>> WARNING: Exception checking completion status for job:
>>>>>>[2014-10-28T01:59:32.813-07:00]: Messsage: java.lang.Exception:
>>>>>>java.lang.NullPointerException
>>>>>>
>>>>>>Question 2): I think currently on my side any new incoming workflow
>>>>>>task I am sending with the following command is being directed to the
>>>>>>waiting "QUEUE" because of the min pool size (i.e.
>>>>>>6) (I can increase this to a larger number though):
>>>>>> ./wmgr-client --url http://localhost:9200 --operation --sendEvent --eventName fileconcatenator-pge --metaData --key RunID testNumber1
>>>>>>If possible, I would like to know if there is a way we can purge
>>>>>>the queue and get rid of the workflow tasks in either "RSUBMIT" or
>>>>>>"QUEUED" that I have already sent. Please kindly help.
>>>>>>
>>>>>>Very sorry for troubling you with this. To be honest, I find OODT a
>>>>>>bit challenging to grasp within a short time frame, probably because
>>>>>>there is no book like an "OODT in Action", as there is for Solr, and
>>>>>>what I am doing is just trial and error blended with guessing. I don’t
>>>>>>want to make blind guesses, so it will be appreciated if you can please
>>>>>>also shed some light on where I can get more logging information, or
>>>>>>other ways I can troubleshoot. I think it might be worth tracking what
>>>>>>is happening when a workflow reaches the status "RSUBMIT" and how to get
>>>>>>logging info specific to it...
>>>>>>
>>>>>>Again, your advice and kind help will be appreciated as usual.
>>>>>>
>>>>>>Thanks
>>>>>>Luke
>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Mattmann, Chris A (3980)
>>>>>>> [mailto:[email protected]]
>>>>>>> Sent: October 26, 2014 22:18
>>>>>>> To: Luke; 'Zichuan Wang'
>>>>>>> Cc: 'Christian Alan Mattmann'; [email protected]; [email protected];
>>>>>>> [email protected]
>>>>>>> Subject: Re: re: Question about OODT file manager
>>>>>>>
>>>>>>> Hi Luke,
>>>>>>>
>>>>>>> Thanks and sorry it’s taken me a while to reply.
>>>>>>> Here are some details below:
>>>>>>>
>>>>>>> -----Original Message-----
>>>>>>> From: Luke <[email protected]>
>>>>>>> Date: Sunday, October 26, 2014 at 6:19 PM
>>>>>>> To: Chris Mattmann <[email protected]>, 'Zichuan Wang'
>>>>>>> <[email protected]>
>>>>>>> Cc: Chris Mattmann <[email protected]>, "[email protected]"
>>>>>>> <[email protected]>, "[email protected]" <[email protected]>,
>>>>>>> "[email protected]" <[email protected]>
>>>>>>> Subject: RE: re: Question about OODT file manager
>>>>>>>
>>>>>>> >Hi Professor Mattmann and OODT DEV,
>>>>>>> >
>>>>>>> >Sorry to trouble you with this email; our team has been struggling
>>>>>>> >to send JSON files to Solr through OODT.
>>>>>>> >One of the difficulties is still getting the OODT workflow to call
>>>>>>> >poster.py in etllib.
>>>>>>>
>>>>>>> Sorry that you’re having difficulty, let me try and help.
>>>>>>>
>>>>>>> >I am not sure if my understanding of the OODT requirement is correct;
>>>>>>> >I hope you can please kindly advise and help with our confusion.
>>>>>>> >
>>>>>>> >The set of goals I have in mind with OODT is as follows; please kindly
>>>>>>> >confirm and clarify:
>>>>>>> >
>>>>>>> >1)
>>>>>>> >Get the File-Manager up and running.
>>>>>>>
>>>>>>> Yep, hopefully as installed via OODT RADIX.
>>>>>>>
>>>>>>> >2)
>>>>>>> >Send all JSON files with the wmgr-client command to the fileManager server.
>>>>>>> >(I believe we can achieve it with a bash script or probably Python
>>>>>>> >that calls the command line sequentially with each JSON file name
>>>>>>> >as an argument?!)
>>>>>>>
>>>>>>> Suggestion:
>>>>>>>
>>>>>>> 1. Use the OODT crawler and file manager to crawl/index the JSON
>>>>>>> files (in place data transfer).
>>>>>>> 2. Take a look at CAS-PGE, it will help you write a workflow task
>>>>>>> that will wrap ETLlib and the poster command.
>>>>>>> 3.
>>>>>>> Once you are confident with #2, whip up a script that pages
>>>>>>> through all of your indexed JSON files, and then for each one,
>>>>>>> submits a workflow event (you may need to look into aggregating
>>>>>>> them) that calls your CAS-PGE wrapped poster task from ETLlib.
>>>>>>>
>>>>>>> >3)
>>>>>>> >Once we have the JSON files sent and stored in the File-Manager, we
>>>>>>> >need to get the workflow-manager up and running, and we can create a
>>>>>>> >workflow that sends those JSON files from the file manager to Solr.
>>>>>>>
>>>>>>> See above.
>>>>>>>
>>>>>>> >4)
>>>>>>> >Create a workflow according to the Workflow2 User Guide
>>>>>>> ><https://cwiki.apache.org/confluence/display/OODT/Workflow2+User+Guide>
>>>>>>> >
>>>>>>> >Here comes the problem...
>>>>>>> >I am not sure how to create a workflow task which can call
>>>>>>> >poster.py in the Python etllib. It looks like we need to create our own
>>>>>>> >Java class that extends <TaskInstance>, which is an abstract Java
>>>>>>> >class with one abstract method that has the following signature:
>>>>>>> >
>>>>>>> >protected abstract ResultsState performExecution(ControlMetadata crtlMetadata);
>>>>>>> >
>>>>>>> >However, the detail of where to find the corresponding
>>>>>>> >libs and where to put our implementation in the workflow manager is
>>>>>>> >missing from that page. I am not sure if we should use
>>>>>>> >TaskInstance, but it seems the workflow has to have an interface
>>>>>>> >through which it can call the Python code, i.e. poster.py, and it looks
>>>>>>> >like we need to implement TaskInstance::performExecution by
>>>>>>> >injecting the code that calls poster.py and returns the result state.
>>>>>>> >
>>>>>>> >It would be greatly appreciated if you could please shed some
>>>>>>> >light and advise how we can get a task instance to call
>>>>>>> >poster.py.
>>>>>>> >BTW, I am also not sure if my understanding is correct; please kindly
>>>>>>> >correct it if it is not. Your help will be appreciated as usual.
>>>>>>> >
>>>>>>> >Thanks
>>>>>>> >Luke
>>>>>>>
>>>>>>> Thanks Luke, see above. Let me know if it helps.
>>>>>>>
>>>>>>> Cheers!
>>>>>>>
>>>>>>> Chris
>>>>>>>
>>>>>>> >
>>>>>>> >From: Mattmann, Chris A (3980) [mailto:[email protected]]
>>>>>>> >Sent: October 25, 2014 13:34
>>>>>>> >To: Zichuan Wang
>>>>>>> >Cc: Christian Alan Mattmann; Luke; [email protected]; [email protected]
>>>>>>> >Subject: Re: Re: Question about OODT file manager
>>>>>>> >
>>>>>>> >Please cc [email protected] <mailto:[email protected]> I will reply in
>>>>>>> >detail soon
>>>>>>> >
>>>>>>> >Sent from my iPhone
>>>>>>> >
>>>>>>> >On Oct 25, 2014, at 1:26 PM, "Zichuan Wang" <[email protected]> wrote:
>>>>>>> >
>>>>>>> >Dear Professor,
>>>>>>> >
>>>>>>> >Could you please also explain how I can crawl all JSON file names under
>>>>>>> >a specific directory using CAS-PGE?
>>>>>>> >I’ll work through this example
>>>>>>> >https://cwiki.apache.org/confluence/display/OODT/CAS-PGE+Learn+by+Example
>>>>>>> >but it doesn’t mention anything about crawling; instead it
>>>>>>> >manually sets the input file paths...
>>>>>>> >
>>>>>>> >--
>>>>>>> >Zichuan Wang
>>>>>>> >University of Southern California, Department of Computer Science
>>>>>>> >
>>>>>>> >On Saturday, October 25, 2014, at 12:10 PM, Zichuan Wang wrote:
>>>>>>> >
>>>>>>> >Dear Professor,
>>>>>>> >
>>>>>>> >In the assignment 2 specification I noticed that you mentioned the OODT
>>>>>>> >File Manager, but from my understanding, we are using the ETLLib poster,
>>>>>>> >which talks directly to Solr. So how can we use the OODT File Manager
>>>>>>> >in this assignment?
>>>>>>> >
>>>>>>> >--
>>>>>>> >Zichuan Wang
>>>>>>> >University of Southern California, Department of Computer Science
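
The "whip up a script" suggestion quoted in this thread (submit one workflow event per JSON file) could be sketched roughly as follows in Python. This is a sketch under stated assumptions, not a definitive implementation. It walks a local directory for *.json files as a stand-in for paging through the file manager catalog, which the thread does not show how to query. The wmgr-client flags, the RunID key and the fileconcatenator-pge event name are copied from the command quoted above; the helper names are hypothetical, and using the file stem as the RunID value is an illustration only. How the metadata maps to an input file depends on your own CAS-PGE configuration.

```python
from pathlib import Path


def find_json_files(root):
    """Return all *.json files under root, sorted for a stable submission order."""
    return sorted(Path(root).rglob("*.json"))


def send_event_argv(run_id, event_name="fileconcatenator-pge",
                    wmgr_url="http://localhost:9200"):
    """Build the wmgr-client sendEvent command line quoted in the thread."""
    return [
        "./wmgr-client", "--url", wmgr_url,
        "--operation", "--sendEvent",
        "--eventName", event_name,
        "--metaData", "--key", "RunID", run_id,
    ]


def plan_events(root, **kwargs):
    """One command line per JSON file; the RunID value is the file stem (assumption)."""
    return [send_event_argv(path.stem, **kwargs) for path in find_json_files(root)]


# Dry run for a single event, without contacting the workflow manager.
print(" ".join(send_event_argv("testNumber1")))
```

Each planned command could then be run with subprocess.run(argv, check=True) from the workflow manager's bin directory. Given the pool-size behaviour described earlier in the thread, it may be worth submitting in small batches rather than all at once.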
