Oozie devs, did anyone get a chance to take a look at this?

> On 08-Nov-2013, at 6:08 pm, Shwetha GS <[email protected]> wrote:
>
> Hi,
>
> We have seen weird issues with CallableQueueService with oozie 3.3.2. We
> couldn't root-cause the exact code causing the issue, so not sure if its
> already fixed in 4.0. Any pointers will be helpful:
>
> Materialisation for a coord just stops. CoordMaterializeTriggerService picks
> up that coord at every materialisation interval, but
> CoordMaterializeTransitionXCommand doesn't get called. Looks like
> CoordMaterializeTransitionXCommand is lost somewhere in the queue. Whenever
> this issue happens, the number of coords picked up for materialisation is
> 40-50 and we also see this log:
>
> oozie.log-2013-11-08-01:2013-11-08 01:00:12,225 WARN CallableQueueService:542 - USER[-] GROUP[-] max concurrency for callable [#composite#coord_mater] exceeded, requeueing with [500]ms delay
>
> Restarting oozie resumes materialization.
>
> Looks like materialisation batch size is 50, and in callable queue service,
> composite callable batch size is set to 10, and max concurrency is 3. So,
> when there are more than 30 coords picked for materialisation, the 4th/5th
> batch of coords is magically lost somewhere. Code looks fine and don't know
> where the leak is.
>
> Tried re-producing this in local machine by tuning these configs, but
> couldn't get anything
>
> Thanks,
> Shwetha
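
To make the arithmetic in the quoted report concrete, here is a rough sketch of the batching math only. It is not Oozie code: the class and method names are made up for illustration, and it assumes every composite callable shares the single type [#composite#coord_mater] and that all of them are polled while the first three are still running. It uses the values from the report (composite batch size 10, max concurrency 3).

```java
/** Hypothetical helper, not part of Oozie: models the batching arithmetic from the report. */
final class CoordMaterBatchSketch {

    /** Number of composite callables needed to hold `coords` at `batchSize` coords each. */
    static int compositeCount(int coords, int batchSize) {
        return (coords + batchSize - 1) / batchSize; // ceiling division
    }

    /**
     * Composite callables beyond the concurrency limit. Under the assumption above,
     * these are the ones that would hit the "max concurrency ... exceeded, requeueing" path.
     */
    static int requeuedOnFirstPass(int coords, int batchSize, int maxConcurrency) {
        return Math.max(0, compositeCount(coords, batchSize) - maxConcurrency);
    }

    public static void main(String[] args) {
        int batchSize = 10;      // composite callable batch size, as reported
        int maxConcurrency = 3;  // max concurrency per callable type, as reported
        for (int coords : new int[] {30, 40, 50}) {
            System.out.printf("coords=%d composites=%d requeued=%d%n",
                    coords,
                    compositeCount(coords, batchSize),
                    requeuedOnFirstPass(coords, batchSize, maxConcurrency));
        }
    }
}
```

With 30 coords nothing is requeued, while 40 and 50 coords leave one and two composites over the limit. That matches the observation that the problem shows up only when 40-50 coords are picked and that it is the 4th/5th batch that goes missing. The sketch does not explain where the requeued callables are lost, only which ones take the requeue path.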
