Hi,

Are there any good resources on running workflows with a very large number of 
actions, or does anyone have experience doing so?

We're currently using an Oozie install allocated 4 GB of memory and connected 
to a PostgreSQL database, and we're successfully running workflows with hundreds 
of actions. However, we're having trouble scaling up to workflows that contain 
tens of thousands of actions. For example, errors like "E0603: SQL error in 
operation, Ran out of memory retrieving query results" or "E0603: SQL error in 
operation, An I/O error occured while sending to the backend" occur in the 
Oozie logs, but we also see other symptoms like the Oozie console becoming very 
slow and unresponsive.

What are the typical and maximum workflow sizes that people have seen? Both the 
total number of actions in a workflow and the maximum number of actions after a 
fork would be useful.

I want to get an idea of whether we're even in the ballpark so that it's 
worthwhile looking at tuning the various configuration settings for Oozie or 
whether we're simply too far out to be reasonable.

Thanks!

-- Denis
