[ 
https://issues.apache.org/jira/browse/OOZIE-348?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13101903#comment-13101903
 ] 

Hadoop QA commented on OOZIE-348:
---------------------------------

mislam77 remarked:
So there will be 2 queues: one for coordinator input checks (queue 1) and the 
other for the rest of the commands (queue 2).
In this approach, the questions are:
* Will there be 2 thread pools? I assume there will be.

* For queue 2, the re-queuing will still happen, right? Although we don't see 
any problem with the other commands at this point, do you think a similar 
situation could arise later? Since re-queuing perturbs the original ordering, 
the queue processing will be unfair. Considering this, shouldn't we look for 
another approach?

* How does the thread pool size impact the system? I ask because we would like 
to increase the default thread pool size from 120.
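To make the first and third questions concrete, here is a rough sketch of what "2 queues with 2 thread pools" could look like. It only assumes the standard java.util.concurrent classes; the class name TwoQueueSketch, the Kind enum, and the pool and queue sizes are made up for illustration and are not Oozie's actual classes or settings.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.ThreadPoolExecutor;
import java.util.concurrent.TimeUnit;

// Hypothetical sketch: one bounded queue plus thread pool per command kind.
class TwoQueueSketch {
    // Illustrative command kinds, not Oozie's real command classes.
    enum Kind { COORD_INPUT_CHECK, OTHER }

    private final ExecutorService inputCheckPool; // serves queue 1
    private final ExecutorService generalPool;    // serves queue 2

    TwoQueueSketch(int inputCheckThreads, int generalThreads, int queueCapacity) {
        inputCheckPool = newPool(inputCheckThreads, queueCapacity);
        generalPool = newPool(generalThreads, queueCapacity);
    }

    // Fixed-size pool backed by its own bounded FIFO queue.
    // execute() throws RejectedExecutionException once that queue is full.
    private static ExecutorService newPool(int threads, int capacity) {
        return new ThreadPoolExecutor(threads, threads, 0L, TimeUnit.MILLISECONDS,
                new LinkedBlockingQueue<>(capacity));
    }

    // Route each command to the pool that owns its queue.
    void submit(Kind kind, Runnable command) {
        ExecutorService pool = (kind == Kind.COORD_INPUT_CHECK) ? inputCheckPool : generalPool;
        pool.execute(command);
    }

    void shutdownAndWait() throws InterruptedException {
        inputCheckPool.shutdown();
        generalPool.shutdown();
        inputCheckPool.awaitTermination(10, TimeUnit.SECONDS);
        generalPool.awaitTermination(10, TimeUnit.SECONDS);
    }

    // Submits three commands of each kind and returns the order they ran in.
    static List<String> demo() throws InterruptedException {
        List<String> log = Collections.synchronizedList(new ArrayList<>());
        TwoQueueSketch sketch = new TwoQueueSketch(1, 1, 100);
        for (int i = 0; i < 3; i++) {
            final int n = i;
            sketch.submit(Kind.COORD_INPUT_CHECK, () -> log.add("check-" + n));
            sketch.submit(Kind.OTHER, () -> log.add("other-" + n));
        }
        sketch.shutdownAndWait();
        return new ArrayList<>(log);
    }

    public static void main(String[] args) throws InterruptedException {
        System.out.println(demo());
    }
}
```

With a separate pool per queue, a flood of input checks can fill only its own queue and occupy only its own threads. The total thread count becomes the sum of the two pool sizes, which is one reason the pool size question matters.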

Can we discuss the other approach too, i.e. keeping the queue in the DB?

If we want to implement a hot-hot or load-balanced system (a possible future 
direction), I think the DB approach will help with that.
In the current approach, the same queue would be created on both systems 
(although both might not process the same command), resulting in the 
unnecessary overhead of keeping the same elements in both queues.
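As a rough illustration of why a single shared queue helps in the hot-hot case, the sketch below lets two "servers" claim commands from one shared store, so each command is held once and processed once. An in-memory map stands in for the DB table here; in a real DB the claim step would presumably be a conditional UPDATE on a status column, but that is my assumption. All names (SharedQueueSketch, claimNext, server-A, server-B) are hypothetical.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.concurrent.Callable;
import java.util.concurrent.ConcurrentSkipListMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical sketch: one shared queue claimed by several servers.
class SharedQueueSketch {
    private static final String PENDING = "PENDING";

    // Stand-in for a DB table: command id -> PENDING or the owning server.
    // The sorted map keeps commands in id order, like ORDER BY id would.
    private final ConcurrentSkipListMap<Long, String> rows = new ConcurrentSkipListMap<>();

    void enqueue(long id) {
        rows.put(id, PENDING);
    }

    // Atomically claim the oldest pending command; returns null if none is left.
    // replace(key, old, new) succeeds for exactly one caller per command.
    Long claimNext(String server) {
        for (Map.Entry<Long, String> row : rows.entrySet()) {
            if (PENDING.equals(row.getValue())
                    && rows.replace(row.getKey(), PENDING, server)) {
                return row.getKey();
            }
        }
        return null;
    }

    // Keep claiming until the shared queue has no pending commands.
    List<Long> drain(String server) {
        List<Long> claimed = new ArrayList<>();
        Long id;
        while ((id = claimNext(server)) != null) {
            claimed.add(id);
        }
        return claimed;
    }

    // Two servers drain 100 commands concurrently; returns what each one claimed.
    static List<List<Long>> demo() throws Exception {
        SharedQueueSketch queue = new SharedQueueSketch();
        for (long id = 1; id <= 100; id++) {
            queue.enqueue(id);
        }
        ExecutorService servers = Executors.newFixedThreadPool(2);
        Callable<List<Long>> serverA = () -> queue.drain("server-A");
        Callable<List<Long>> serverB = () -> queue.drain("server-B");
        Future<List<Long>> a = servers.submit(serverA);
        Future<List<Long>> b = servers.submit(serverB);
        List<List<Long>> result = Arrays.asList(a.get(), b.get());
        servers.shutdown();
        return result;
    }

    public static void main(String[] args) throws Exception {
        List<List<Long>> result = demo();
        System.out.println("server-A claimed " + result.get(0).size()
                + ", server-B claimed " + result.get(1).size());
    }
}
```

How the 100 commands split between the two servers varies from run to run, but no command is claimed twice and neither server keeps its own copy of the queue.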

> GH-561: Redesign oozie internal queue
> -------------------------------------
>
>                 Key: OOZIE-348
>                 URL: https://issues.apache.org/jira/browse/OOZIE-348
>             Project: Oozie
>          Issue Type: Bug
>            Reporter: Hadoop QA
>
> We have had a lot of issues related to the Oozie internal queue, including 
> queue overflow and re-queuing of the same heavily used commands to avoid 
> starvation. There are other situations too. The problem becomes very obvious 
> under very high load.
> I would like to open up the discussion to find a better architectural 
> design for the longer term, with very high load in mind.
> The following proposals are meant to start the discussion; they range from a 
> complete overhaul to adjustments of the current design:
> 1. Implement the queue in the DB:
>    Pros: Persistence. Useful in a hot-hot or load-balancing setup. 
> Single source of truth. Different levels of ordering can be done as needed 
> through SQL. No need to worry about queue size. No need to recreate the 
> queue on every restart, so the recovery service might be less busy.
>    Cons: Extra DB access overhead.
>    A middle approach could be to keep an in-memory cache with strict 
> conditions. The details can be discussed later.
> 2. Re-queuing of the same commands (which is used for throttling) should be 
> redesigned. In this case, make sure the command is re-queued in the *same* 
> place, not at the end of the queue. I know this breaks the meaning of a 
> queue, so we might need to use a different data structure. 
> Currently, re-queuing the same command at the end creates a starvation-like 
> (livelock) situation.
> 3. Multiple queues: one for the coordinator input check, which is used 99% of the time.
> Comments?
> Regards,
> Mohammad

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
