Hi,

1) To make Oozie store the Pig counters, set the following property in
your workflow definition:
<property>
<name>oozie.action.external.stats.write</name>
<value>true</value>
</property>
Have a look at
http://oozie.apache.org/docs/3.2.0-incubating/WorkflowFunctionalSpec.html#a4.2.5_Hadoop_EL_Functions
for more details.
You can also retrieve these stats/counters through the workflow action info API
(oozie job -info <actionId> -verbose).
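
As a rough sketch of where the property above could go, here is a Pig
action with it set in the action's <configuration> block. The action name
(pig-node), the script name, the transition targets, and the
${jobTracker}/${nameNode} parameters are placeholders I made up for
illustration, and putting the property at the action level is an
assumption, so adjust it to your own workflow.

```xml
<!-- Hypothetical Pig action; only the property itself comes from the note above. -->
<action name="pig-node">
    <pig>
        <job-tracker>${jobTracker}</job-tracker>
        <name-node>${nameNode}</name-node>
        <configuration>
            <!-- Ask Oozie to store the external job stats/counters for this action. -->
            <property>
                <name>oozie.action.external.stats.write</name>
                <value>true</value>
            </property>
        </configuration>
        <!-- Placeholder script name. -->
        <script>script.pig</script>
    </pig>
    <!-- Placeholder transitions. -->
    <ok to="end"/>
    <error to="fail"/>
</action>
```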

2) You can have more coordinator actions waiting for dependencies by
increasing the throttle value. By default, throttle is set to 12, so only
12 actions of a coordinator can be in the WAITING state at any time.
For more details, have a look at
http://oozie.apache.org/docs/3.1.3-incubating/CoordinatorFunctionalSpec.html#a6.1.6._Coordinator_Action_Execution_Policies
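
A minimal sketch of raising the throttle, assuming it is set in the
<controls> section of the coordinator definition. The value 24 is only an
example, and the rest of the coordinator (datasets, input events, action)
is left out.

```xml
<!-- Fragment of a coordinator-app definition; other elements omitted. -->
<controls>
    <!-- Example value only: allow up to 24 actions in WAITING state instead of the default 12. -->
    <throttle>24</throttle>
</controls>
```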

Thanks,
Virag   

On 9/9/13 5:15 AM, "Serega Sheypak" <[email protected]> wrote:

>Hi, we have more than 20 running coordinators and more than 60 workflows
>used in these coordinators.
>Coordinators materialize on an hourly, daily, weekly, or multi-week basis.
>
>We have 2 major problems:
>1. We want a general approach for collecting MapReduce (Pig) counters. Many
>of our Pig UDFs report counters to the JobTracker. We want to collect these
>counters after a workflow action runs and store them in some DB (we are
>thinking about Mongo because of its schemaless nature).
>We don't want to copy-paste code in each workflow. We are looking for some
>service-wide solution.
>
>2. Some of our coordinators can stop materializing because of missing
>data dependencies. For example, one of the data providers may fail to send
>data. It's OK for a coordinator action to be in the WAITING state, but it's
>not OK for us. We have strict requirements for materialization time (how
>long it runs) and period (how often it runs).
>We don't want to copy-paste code in each coordinator, we are looking for
>some service-wide solution.
>
>Can you suggest something for us to see/read/test?
>
>Thank you!
