Hi, we have more than 20 running coordinators and more than 60 workflows used by these coordinators. The coordinators materialize on hourly, daily, weekly, or multi-week schedules.
We have 2 major problems:

1. We want a general approach for collecting MapReduce (Pig) counters. Many of our Pig UDFs report counters to the JobTracker. We want to collect these counters after a workflow action runs and store them in some DB (we are thinking about MongoDB because of its schemaless nature). We don't want to copy-paste code into each workflow. We are looking for a service-wide solution.

2. Some of our coordinators can stop materializing because of missing data dependencies. For example, one of the data providers may fail to send data. Technically it is fine for a coordinator action to be in the WAITING state, but it is not acceptable for us. We have strict requirements for materialization time (how long it runs) and period (how often it runs). We don't want to copy-paste code into each coordinator; we are looking for a service-wide solution.

Can you suggest something for us to look at, read, or test? Thank you!
