[ 
https://issues.apache.org/jira/browse/AURORA-1769?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15484533#comment-15484533
 ] 

Maxim Khutornenko commented on AURORA-1769:
-------------------------------------------

My suggestion was targeting the restart issue where events should be suppressed 
regardless: you don't want to resend {{TaskStateChange}} events for all tasks 
every time a scheduler restarts.

As for the general perf issue, blocking {{EventBus}} threads was one of the 
concerns raised in the original https://reviews.apache.org/r/47440/ RB. We 
concluded back then that using aggressive connection timeouts _was_ appropriate 
to mitigate possible event queue saturation. If you feel that is no longer the 
case, please follow up with an async proposal. You'll likely need something 
akin to the [BatchWorker|https://reviews.apache.org/r/51759/] sending thread 
working off of its own queue. In any case, given this feature is optional and 
off by default, I feel blocking the release until it's improved is not justified.
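
To make the async idea concrete, here is a minimal sketch of a sending thread 
working off of its own bounded queue. It is only an illustration, not the 
BatchWorker code and not an existing Aurora API: {{AsyncEventSender}}, 
{{submit}} and the {{Consumer}} standing in for the blocking webhook POST are 
all hypothetical names, and dropping events when the queue is full is an 
assumed policy that a real proposal would need to discuss.

```java
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.concurrent.TimeUnit;
import java.util.function.Consumer;

// Hypothetical sketch: hands events to a dedicated thread so that the
// caller (e.g. an EventBus subscriber) never waits on a slow endpoint.
final class AsyncEventSender<E> implements AutoCloseable {
  private final BlockingQueue<E> queue;
  private final Consumer<E> sender;  // stands in for the blocking HTTP POST
  private final Thread worker;
  private volatile boolean running = true;

  AsyncEventSender(int capacity, Consumer<E> sender) {
    this.queue = new LinkedBlockingQueue<>(capacity);
    this.sender = sender;
    this.worker = new Thread(this::drain, "webhook-sender");
    this.worker.setDaemon(true);
    this.worker.start();
  }

  // Non-blocking: returns false (event dropped) when the queue is full.
  boolean submit(E event) {
    return queue.offer(event);
  }

  // Runs on the worker thread until close() is called and the queue is empty.
  private void drain() {
    try {
      while (running || !queue.isEmpty()) {
        E event = queue.poll(100, TimeUnit.MILLISECONDS);
        if (event == null) {
          continue;
        }
        try {
          sender.accept(event);
        } catch (RuntimeException e) {
          // A failed send must not kill the worker; a real version would log.
        }
      }
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }

  // Stops accepting the loop condition and waits for queued events to be sent.
  @Override
  public void close() {
    running = false;
    try {
      worker.join();
    } catch (InterruptedException e) {
      Thread.currentThread().interrupt();
    }
  }
}
```

With something along these lines the subscriber would only call 
{{submit(event)}} and return immediately, so a slow or unreachable endpoint 
could back up the sender's own queue but not the {{EventBus}} threads.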

> Enabling webhook is synchronous and could cause longer leader reelection cycle
> ------------------------------------------------------------------------------
>
>                 Key: AURORA-1769
>                 URL: https://issues.apache.org/jira/browse/AURORA-1769
>             Project: Aurora
>          Issue Type: Bug
>            Reporter: Dmitriy Shirchenko
>            Assignee: Dmitriy Shirchenko
>
> We had an issue where, on scheduler leader reelection, the EventBus was full 
> of TaskStateChange events, which prevented the scheduler from posting the 
> DriverRegistered() message, so the Aurora scheduler failed to register 
> within 1 minute. 



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
