[ 
https://issues.apache.org/jira/browse/OOZIE-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101902#comment-13101902
 ] 

Hadoop QA commented on OOZIE-348:
---------------------------------

tucu00 remarked:
Well, then the solution would be to use a separate queue service exclusively 
for coordinator input checks. In that case the thread pool will be the only 
throttling mechanism, and no concurrency re-queueing would happen.
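
A minimal Java sketch of that idea, not Oozie's actual code: input checks
go to their own fixed-size thread pool, so the pool size is the only
throttle and a check that cannot start yet waits in the pool's internal
queue instead of being re-queued. The class name, method name and pool
size are hypothetical, and the "check" is a dummy counter increment.

```java
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class InputCheckPoolSketch {
    // Hypothetical helper: runs `checks` dummy input checks on a dedicated
    // pool of `poolSize` threads and returns how many of them completed.
    static int runInputChecks(int poolSize, int checks) {
        // Dedicated pool: at most `poolSize` checks run concurrently.
        // Pending checks wait in the executor's own (unbounded) queue.
        ExecutorService inputCheckPool = Executors.newFixedThreadPool(poolSize);
        AtomicInteger completed = new AtomicInteger();
        for (int i = 0; i < checks; i++) {
            // Stand-in for a real coordinator input check.
            inputCheckPool.execute(() -> completed.incrementAndGet());
        }
        inputCheckPool.shutdown();
        try {
            if (!inputCheckPool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("input checks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return completed.get();
    }

    public static void main(String[] args) {
        System.out.println("completed=" + runInputChecks(4, 100));
    }
}
```

Note that the executor's internal queue here is unbounded, so this sketch
addresses the re-queueing concern only, not queue overflow.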

> GH-561: Redesign oozie internal queue
> -------------------------------------
>
>                 Key: OOZIE-348
>                 URL: https://issues.apache.org/jira/browse/OOZIE-348
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Hadoop QA
>
> We have had a lot of issues related to the Oozie internal queue. These include 
> queue overflow as well as re-queuing the same heavily used commands to avoid 
> starvation. There are other situations too. The problem becomes very obvious 
> under very high load.
> I would like to open up the discussion to find a better architectural 
> design for the longer term, considering very high-load situations.
> The following proposals are meant to initiate the discussion; they range from 
> a complete overhaul to adjusting the current design:
> 1. Implement the queue in the DB:
>    Pros: Persistence. Useful in a hot-hot or load-balancing setup. 
> Single source of truth. Different levels of ordering can be done as needed 
> through SQL. No need to worry about queue size. No need to recreate the queue 
> on every restart -- the recovery service might be less busy.
>    Cons: Extra DB access overhead.
>    A middle approach could be to keep a memory cache with strict conditions. 
> The details could be discussed later.
> 2. Re-queuing the same commands (which is used for throttling) should be 
> redesigned. In this case, make sure re-queuing happens in the *same* place, 
> not at the end of the queue. I know this will break the meaning of a queue, 
> so we might need to use a different data structure.
> Currently, re-queuing the same command at the end creates a starvation-like 
> (live-lock) situation.
> 3. Multiple queues. One for the coordinator input check, which is used 99% of 
> the time.
> Comments?
> Regards,
> Mohammad
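
Proposal 2 in the description could be sketched roughly as below. This is
only an illustration under assumptions, not a design taken from the issue:
instead of removing a throttled command and appending it to the tail, the
consumer scans from the head and takes the first command that may run now,
so throttled commands keep their position. The class name, method names and
the `canRunNow` predicate are all hypothetical.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.function.Predicate;

final class InPlaceQueueSketch<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // New commands are appended at the tail, as in a normal queue.
    synchronized void add(T item) {
        items.addLast(item);
    }

    // Removes and returns the first command that is allowed to run now.
    // Commands that are currently throttled are skipped but stay where
    // they are, so they are never pushed behind newer commands.
    // Returns null if no command can run at the moment.
    synchronized T pollRunnable(Predicate<T> canRunNow) {
        Iterator<T> it = items.iterator();
        while (it.hasNext()) {
            T item = it.next();
            if (canRunNow.test(item)) {
                it.remove();
                return item;
            }
        }
        return null;
    }

    synchronized int size() {
        return items.size();
    }

    public static void main(String[] args) {
        InPlaceQueueSketch<String> queue = new InPlaceQueueSketch<>();
        queue.add("A");
        queue.add("B");
        queue.add("C");
        // "A" is throttled for now, so the next runnable command is taken.
        System.out.println(queue.pollRunnable(c -> !c.equals("A")));
        // Once "A" may run, it is still ahead of "C".
        System.out.println(queue.pollRunnable(c -> true));
    }
}
```

The trade-off is a linear scan on each poll; whether that is acceptable
depends on how many throttled commands sit at the head under load.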

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
