GH-561: Redesign oozie internal queue
-------------------------------------

                 Key: OOZIE-348
                 URL: https://issues.apache.org/jira/browse/OOZIE-348
             Project: Oozie
          Issue Type: Bug
            Reporter: Hadoop QA


We have had a lot of issues related to the Oozie internal queue. They include 
queue overflow as well as re-queuing the same heavily used commands to avoid 
starvation. There are other situations too. The problem becomes very obvious 
under very high load.


I would like to open up the discussion to find a better architectural 
design for the longer term, with very high-load situations in mind.

The following proposals are meant to start the discussion. They range from a 
complete overhaul to adjustments of the current design:

1. Implement the queue in the DB:
   Pros: Persistence. Useful in a hot-hot or load-balancing setup. Single 
source of truth. Different levels of ordering could be done as needed through 
SQL. No need to worry about queue size. No need to recreate the queue on every 
restart -- the recovery service might be less busy.

  Cons: Extra DB access overhead.

  A middle approach could be to keep an in-memory cache with strict 
conditions. The details could be discussed later.
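
To make proposal 1 a bit more concrete, here is a minimal JDBC sketch of a 
DB-backed queue whose ordering is expressed in SQL. The table COMMAND_QUEUE, 
its columns, and the class and method names are all hypothetical; none of them 
come from the Oozie schema or code base. The sketch also leaves out how a 
server would claim rows in a hot-hot setup, which a real design would have to 
address.

```java
import java.sql.Connection;
import java.sql.PreparedStatement;
import java.sql.ResultSet;
import java.sql.SQLException;
import java.util.ArrayList;
import java.util.List;

// Hypothetical sketch only: table, columns, and names are assumptions.
class DbCommandQueue {
    static final String ENQUEUE_SQL =
        "INSERT INTO COMMAND_QUEUE (COMMAND_KEY, PRIORITY, CREATED_TIME) "
        + "VALUES (?, ?, ?)";

    // Ordering lives in SQL: highest priority first, then oldest first.
    static final String NEXT_BATCH_SQL =
        "SELECT COMMAND_KEY FROM COMMAND_QUEUE "
        + "ORDER BY PRIORITY DESC, CREATED_TIME ASC";

    // Persists one command so it survives a restart.
    static void enqueue(Connection conn, String key, int priority,
                        long createdTime) throws SQLException {
        try (PreparedStatement ps = conn.prepareStatement(ENQUEUE_SQL)) {
            ps.setString(1, key);
            ps.setInt(2, priority);
            ps.setLong(3, createdTime);
            ps.executeUpdate();
        }
    }

    // Reads up to 'limit' command keys in queue order. It does not lock or
    // delete the rows, so it is not safe for concurrent consumers as is.
    static List<String> nextBatch(Connection conn, int limit)
            throws SQLException {
        List<String> keys = new ArrayList<>();
        try (PreparedStatement ps = conn.prepareStatement(NEXT_BATCH_SQL)) {
            ps.setMaxRows(limit);
            try (ResultSet rs = ps.executeQuery()) {
                while (rs.next()) {
                    keys.add(rs.getString(1));
                }
            }
        }
        return keys;
    }
}
```

Changing the ORDER BY clause is all it would take to try a different ordering, 
which is the flexibility the "Pros" above refer to.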

2. Re-queuing the same commands (which is used for throttling) should be 
redesigned. In this case, make sure the command is re-queued in the *same* 
place, not at the end of the queue. I know this breaks the meaning of a queue, 
so we might need to use a different data structure.

Currently, re-queuing the same command at the end creates a starvation-like 
(live-lock) situation.
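
One possible shape for the "different data structure" in proposal 2 is a list 
that is scanned from the head, where a throttled command is skipped but never 
moved. The class InPlaceQueue and the canRunNow predicate below are 
hypothetical names for illustration, not existing Oozie code.

```java
import java.util.Iterator;
import java.util.LinkedList;
import java.util.function.Predicate;

// Hypothetical sketch: a queue where throttled commands keep their place.
class InPlaceQueue<T> {
    private final LinkedList<T> items = new LinkedList<>();

    // New commands still join at the tail.
    synchronized void add(T command) {
        items.addLast(command);
    }

    // Returns the oldest command that is allowed to run now, or null if
    // none is. Throttled commands are skipped but stay where they are, so
    // they are not pushed behind newer arrivals.
    synchronized T pollRunnable(Predicate<T> canRunNow) {
        Iterator<T> it = items.iterator();
        while (it.hasNext()) {
            T command = it.next();
            if (canRunNow.test(command)) {
                it.remove();
                return command;
            }
        }
        return null;
    }

    synchronized int size() {
        return items.size();
    }
}
```

The trade-off is that each poll may scan past several throttled commands, so 
the cost grows with the number of commands being throttled at that moment.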

3. Multiple queues, with one dedicated to the coordinator input check, which 
is used 99% of the time.
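
As a rough illustration of proposal 3, the sketch below keeps coordinator 
input-check commands in their own queue and alternates between the two queues 
when polling, so a flood of input checks cannot hold back everything else. 
The class name SplitQueues, the use of String for commands, and the 
alternating policy are assumptions made for this example only.

```java
import java.util.ArrayDeque;
import java.util.Queue;

// Hypothetical sketch: one queue for input checks, one for everything else.
class SplitQueues {
    private final Queue<String> inputCheck = new ArrayDeque<>();
    private final Queue<String> general = new ArrayDeque<>();
    private boolean preferGeneral = true;

    synchronized void add(String command, boolean isInputCheck) {
        (isInputCheck ? inputCheck : general).add(command);
    }

    // Alternates which queue is tried first on each call and falls back to
    // the other queue when the preferred one is empty. Returns null when
    // both queues are empty.
    synchronized String poll() {
        Queue<String> first = preferGeneral ? general : inputCheck;
        Queue<String> second = preferGeneral ? inputCheck : general;
        preferGeneral = !preferGeneral;
        String next = first.poll();
        return next != null ? next : second.poll();
    }
}
```

A weighted policy or separate worker pools per queue would be other ways to 
share capacity; strict alternation is just the simplest one to show.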

Comments?

Regards,
Mohammad

--
This message is automatically generated by JIRA.
For more information on JIRA, see: http://www.atlassian.com/software/jira

        
