[ https://issues.apache.org/jira/browse/OOZIE-348?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Roman Shaposhnik reopened OOZIE-348:
------------------------------------
> GH-561: Redesign oozie internal queue
> -------------------------------------
>
> Key: OOZIE-348
> URL: https://issues.apache.org/jira/browse/OOZIE-348
> Project: Oozie
> Issue Type: Bug
> Reporter: Hadoop QA
>
> We have had many issues related to the Oozie internal queue. These include
> queue overflow and the re-queuing of the same heavily used commands to avoid
> starvation, among other situations. The problems become very obvious under
> very high load.
> I would like to open up a discussion to find a better long-term
> architectural design that holds up under very high load.
> The following proposals are meant to start the discussion. They range from a
> complete overhaul to adjustments of the current design:
> 1. Implement the queue in the DB:
> Pros: Persistence. Useful in a hot-hot or load-balancing setup. Single
> source of truth. Different levels of ordering can be done as needed through
> SQL. No need to worry about queue size. No need to rebuild the queue on
> every restart, so the recovery service might be less busy.
> Cons: Extra DB access overhead.
> A middle approach could be to keep an in-memory cache with strict
> conditions. The details can be discussed later.
> 2. Re-queuing the same commands (which is used for throttling) should be
> redesigned. In this case, make sure the command is re-queued in the *same*
> place, not at the end of the queue. I know this breaks the meaning of a
> queue, so we might need to use a different data structure.
> Currently, re-queuing the same command at the end creates a starvation-like
> (live-lock) situation.
> 3. Multiple queues. One would be for the coordinator input check, which is
> used 99% of the time.
> Comments?
> Regards,
> Mohammad
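
As a rough illustration of proposal 2, here is a minimal sketch in Java of a queue in which throttled commands keep their position instead of being moved to the tail. The class name PositionPreservingQueue, the method pollRunnable, and the command names in main are hypothetical and are not taken from the Oozie code base. The predicate stands in for whatever throttling check would actually apply, and the sketch ignores priorities and delays.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.Optional;
import java.util.function.Predicate;

// Hypothetical sketch only; this is not Oozie's queue implementation.
class PositionPreservingQueue<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // New commands still go to the tail, as in a normal FIFO queue.
    public synchronized void add(T item) {
        items.addLast(item);
    }

    // Returns the first command, scanning from the head, that passes the
    // "runnable" check. Commands that fail the check (throttled ones) are
    // skipped but left in place, so they are never pushed behind newer work.
    public synchronized Optional<T> pollRunnable(Predicate<? super T> runnable) {
        Iterator<T> it = items.iterator();
        while (it.hasNext()) {
            T candidate = it.next();
            if (runnable.test(candidate)) {
                it.remove();
                return Optional.of(candidate);
            }
        }
        return Optional.empty();
    }

    public synchronized int size() {
        return items.size();
    }

    public static void main(String[] args) {
        PositionPreservingQueue<String> queue = new PositionPreservingQueue<>();
        queue.add("coord-check-1");
        queue.add("wf-action-1");
        queue.add("coord-check-2");

        // Assume coordinator checks are throttled right now.
        Predicate<String> notThrottled = c -> !c.startsWith("coord-check");
        System.out.println(queue.pollRunnable(notThrottled).orElse("none")); // wf-action-1

        // Once throttling lifts, the oldest skipped command is still first.
        System.out.println(queue.pollRunnable(c -> true).orElse("none")); // coord-check-1
    }
}
```

The trade-off in this sketch is that each poll may scan past several throttled commands, so it costs O(n) in the worst case. In exchange, a throttled command cannot be starved by commands that arrived after it.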
--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira