It's probably a good idea to read these too:

https://cwiki.apache.org/confluence/display/OODT/Understanding+the+flow+of+Metadata+during+PGE+based+Processing

https://cwiki.apache.org/confluence/display/OODT/Understanding+CAS-PGE+Metadata+Precendence

From: Lewis John McGibbney <lewi...@apache.org>
Reply-To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Date: Tuesday, December 4, 2018 at 8:40 AM
To: "dev@oodt.apache.org" <dev@oodt.apache.org>
Subject: Re: Updating Workflow Status to Post-Ingest

 

The current state of my PGE and Workflow policy can be seen at

https://github.com/capstone-coal/coal-sds/tree/master/workflow/src/main/resources/policy

 

On 2018/12/04 05:11:02, lewis john mcgibbney <lewi...@apache.org> wrote: 

Hi Folks,

I am executing the following command:

./crawler/bin/crawler_launcher \
   --filemgrUrl http://localhost:9000 \
   --operation --launchMetCrawler \
   --clientTransferer org.apache.oodt.cas.filemgr.datatransfer.LocalDataTransferFactory \
   --productPath /usr/local/coal-sds-deploy/data/staging \
   --metExtractor org.apache.oodt.cas.metadata.extractors.TikaCmdLineMetExtractor \
   --metExtractorConfig /usr/local/coal-sds-deploy/data/met/tika.conf \
   --failureDir /usr/local/coal-sds-deploy/data/failure \
   --daemonPort 9003 \
   --daemonWait 2 \
   --successDir /usr/local/coal-sds-deploy/data/archive \
   --actionIds DeleteDataFile UpdateWorkflowStatusToIngest \
   --workflowMgrUrl http://localhost:9001

As you can see, I am trying to kick off a workflow after a successful file
ingestion. The error I'm getting is as follows:

INFO: Performing action (id = UpdateWorkflowStatusToIngest : description = Triggers workflow event with the name [ProductType]Ingest)
21:00:22.537 [main] DEBUG org.apache.oodt.cas.workflow.system.rpc.RpcCommunicationFactory - Using workflow manager client factory : class org.apache.oodt.cas.workflow.system.rpc.AvroRpcWorkflowManagerFactory
21:00:22.549 [main] INFO org.apache.oodt.cas.workflow.system.AvroRpcWorkflowManagerClient - Client created successfully for workflow manager URL: http://localhost:9001
Dec 03, 2018 9:00:22 PM org.apache.oodt.cas.crawl.ProductCrawler performProductCrawlerActions
WARNING: Failed to perform crawler action : Action (id = UpdateWorkflowStatusToIngest : description = Triggers workflow event with the name [ProductType]Ingest) returned false
java.lang.Exception: Action (id = UpdateWorkflowStatusToIngest : description = Triggers workflow event with the name [ProductType]Ingest) returned false
     at org.apache.oodt.cas.crawl.ProductCrawler.performProductCrawlerActions(ProductCrawler.java:362)
     at org.apache.oodt.cas.crawl.ProductCrawler.performPostIngestOnSuccessActions(ProductCrawler.java:334)
     at org.apache.oodt.cas.crawl.ProductCrawler.handleFile(ProductCrawler.java:198)
     at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:109)
     at org.apache.oodt.cas.crawl.ProductCrawler.crawl(ProductCrawler.java:76)
     at org.apache.oodt.cas.crawl.daemon.CrawlDaemon.startCrawling(CrawlDaemon.java:84)
     at org.apache.oodt.cas.crawl.cli.action.CrawlerLauncherCliAction.execute(CrawlerLauncherCliAction.java:56)
     at org.apache.oodt.cas.cli.CmdLineUtility.execute(CmdLineUtility.java:331)
     at org.apache.oodt.cas.cli.CmdLineUtility.run(CmdLineUtility.java:188)
     at org.apache.oodt.cas.crawl.CrawlerLauncher.main(CrawlerLauncher.java:37)

I have configured the workflow manager and have a PGE named 'pycoal-pge',
which includes several tasks. I am just not sure how to reference it from
the crawler_launcher input parameters.
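
One reading of the action description in the log above ("Triggers workflow event with the name [ProductType]Ingest") is that the event name is built from the ingested product's type rather than passed as a crawler_launcher parameter. A minimal sketch of that naming rule follows. The helper ingest_event_name and the product type GenericFile are hypothetical and used only for illustration; neither appears in this thread.

```shell
#!/bin/sh
# Hypothetical helper (not part of OODT): build the event name that the
# action description "[ProductType]Ingest" implies for a given product type.
ingest_event_name() {
  # $1 = product type of the ingested file (assumed value, see below)
  printf '%sIngest\n' "$1"
}

# GenericFile is an assumed product type, chosen only as an example.
ingest_event_name GenericFile
```

If that reading is right, the workflow manager would need an event with the resulting name associated with the pycoal-pge workflow. Nothing in this thread confirms that, so treat it as a starting point for checking the policy files.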

Any ideas? Thanks in advance,

Lewis

-- 

http://home.apache.org/~lewismc/

http://people.apache.org/keys/committer/lewismc

 
