> On June 5, 2017, 10:14 p.m., Stephan Erb wrote:
> > LGTM.
> > 
> > Using an async handler should effectively unblock the event bus. If I 
> > understand the context correctly, we will now use an unbounded queue within 
> > netty that will be served by one request handling thread. So we will just 
> > have one outgoing request at a time, correct?
> 
> David McLaughlin wrote:
>     > So we will just have one outgoing request at a time, correct?
>     
>     Quite the contrary. In practice we will have thousands of requests in 
> flight on a single thread, similar to how non-blocking networking stacks like 
> node.js, Tornado and Finagle work. Only one response can be processed at a 
> time, however. So it's important to never CPU-block in callbacks.
> 
> David McLaughlin wrote:
>     s/will/can/
> 
> Stephan Erb wrote:
>     So the tradeoff is that we can now have out-of-order delivery of events? 
> At least to my understanding this has been synchronous and thus in-order 
> before. 
>     
>     If this is the case, please be so kind as to add a short note to 
> https://github.com/apache/aurora/blob/master/docs/features/webhooks.md. We 
> should probably also add a note to the release-notes as it can break existing 
> integrations.
> 
> David McLaughlin wrote:
>     The Scheduler's EventBus is multi-threaded and asynchronous (see 
> PubsubEventModule), so there was never a guarantee of ordered delivery of 
> events even if the HTTP client is synchronous. The EventBus is backed by an 
> Executor (see AsyncModule) that defaults to 8 threads. 
>     
>     So the only trade-off is this: before, if there was latency or the 
> subscribing HTTP server was down, all the EventBus threads could (would?) 
> end up blocked for webhookTimeoutMs waiting for the synchronous HTTP client to 
> return, preventing the rest of the async work in the Scheduler from 
> functioning properly. Now the EventBus thread is unblocked immediately, 
> and all the webhook handling logic moves to a different thread pool.
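
For illustration, here is a minimal sketch of that trade-off using only JDK 
classes. It is not the code in the patch: WebhookHandoffSketch, postAsync and 
busBusyMs are hypothetical names, a single-threaded executor stands in for the 
EventBus pool (which defaults to 8 threads), and a scheduled task stands in 
for the async HTTP client and the subscriber's latency.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.ScheduledExecutorService;
import java.util.concurrent.TimeUnit;

class WebhookHandoffSketch {

  // Hypothetical stand-in for a non-blocking HTTP POST: the response future is
  // completed latencyMs later by the single I/O thread.
  static CompletableFuture<Integer> postAsync(ScheduledExecutorService io, long latencyMs) {
    CompletableFuture<Integer> response = new CompletableFuture<>();
    io.schedule(() -> { response.complete(200); }, latencyMs, TimeUnit.MILLISECONDS);
    return response;
  }

  // Publishes `events` events to a single-threaded "event bus" and returns how many
  // milliseconds pass before that thread has finished running every handler.
  static long busBusyMs(int events, long latencyMs, boolean async) throws Exception {
    ExecutorService eventBus = Executors.newSingleThreadExecutor();
    ScheduledExecutorService io = Executors.newSingleThreadScheduledExecutor();
    CountDownLatch handlersReturned = new CountDownLatch(events);
    CountDownLatch responsesSeen = new CountDownLatch(events);
    long start = System.nanoTime();
    try {
      for (int i = 0; i < events; i++) {
        eventBus.execute(() -> {
          CompletableFuture<Integer> response = postAsync(io, latencyMs);
          if (async) {
            // New behaviour: register a cheap callback and return immediately.
            response.thenAccept(status -> responsesSeen.countDown());
          } else {
            // Old behaviour: hold the event bus thread until the response arrives.
            response.join();
            responsesSeen.countDown();
          }
          handlersReturned.countDown();
        });
      }
      handlersReturned.await();
      long busyMs = TimeUnit.NANOSECONDS.toMillis(System.nanoTime() - start);
      responsesSeen.await(); // let in-flight requests finish before shutting down
      return busyMs;
    } finally {
      eventBus.shutdown();
      io.shutdown();
    }
  }

  public static void main(String[] args) throws Exception {
    System.out.println("sync  handler, bus busy ms: " + busBusyMs(5, 100, false));
    System.out.println("async handler, bus busy ms: " + busBusyMs(5, 100, true));
  }
}
```

In the async case all five requests are in flight at once on the one I/O 
thread, and the callbacks run on that thread one at a time, which is why they 
must stay cheap.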

Stephan - can you confirm you're okay with shipping as-is? Or would you like me 
to clear up some of the existing sparse webhook documentation first?


- David


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/59703/#review176961
-----------------------------------------------------------


On June 1, 2017, 6:33 a.m., David McLaughlin wrote:
> 
> 
> (Updated June 1, 2017, 6:33 a.m.)
> 
> 
> Review request for Aurora, Santhosh Kumar Shanmugham, Stephan Erb, and Zameer 
> Manji.
> 
> 
> Bugs: AURORA-1773
>     https://issues.apache.org/jira/browse/AURORA-1773
> 
> 
> Repository: aurora
> 
> 
> Description
> -------
> 
> Current code uses a synchronous HTTP client, which can block the EventBus. 
> Switch to an async HTTP client.
> 
> 
> Diffs
> -----
> 
>   build.gradle 4802d5e552b978338b037326eae85e193a7eb2d1 
>   src/main/java/org/apache/aurora/scheduler/events/Webhook.java 
> 3868779986285ac302d028f8713f683192951b83 
>   src/main/java/org/apache/aurora/scheduler/events/WebhookModule.java 
> 1f10af71830386652d21961b733bd0927c5436a1 
>   src/test/java/org/apache/aurora/scheduler/events/WebhookTest.java 
> e8335d9b78dbf30bf3ae08b6bdd02018cea76f6b 
> 
> 
> Diff: https://reviews.apache.org/r/59703/diff/1/
> 
> 
> Testing
> -------
> 
> Enabled webhooks in Vagrant and verified message received:
> 
> POST / HTTP/1.1
> Timestamp: 1496298562793
> Content-Length: 2570
> Host: 192.168.33.7:5100
> Accept: */*
> User-Agent: AHC/2.0
> 
> {"task":{"cachedHashCode":0,"assignedTask":{"cachedHashCode":0,"taskId":"www-data-devel-hello_world-0-11fb2654-efd7-4411-84d7-ec8e0ed485d9","task":{"cachedHashCode":371132471,"job":{"cachedHashCode":1193662568,"role":"www-data","environment":"devel","name":"hello_world"},"owner":{"cachedHashCode":226895216,"user":"vagrant"},"isService":true,"numCpus":0.0,"ramMb":0,"diskMb":0,"priority":0,"maxTaskFailures":1,"production":false,"tier":"preemptible","resources":[{"cachedHashCode":-367894814,"setField":"NUM_CPUS","value":1.0},{"cachedHashCode":445972424,"setField":"RAM_MB","value":1},{"cachedHashCode":-2018031677,"setField":"DISK_MB","value":8}],"constraints":[],"requestedPorts":[],"mesosFetcherUris":[],"taskLinks":{},"executorConfig":{"cachedHashCode":-1711220736,"name":"AuroraExecutor","data":"{\"environment\":
>  \"devel\", \"health_check_config\": {\"health_checker\": {\"http\": 
> {\"expected_response_code\": 0, \"endpoint\": \"/health\", 
> \"expected_response\": \"ok\"}}, \"min_consecutive_successes\": 1, 
> \"initial_interval_secs\": 15.0, 
> \"max_consecutive_failures\": 0, \"timeout_secs\": 1.0, \"interval_secs\": 
> 10.0}, \"name\": \"hello_world\", \"service\": true, \"max_task_failures\": 1, 
> \"cron_collision_policy\": \"KILL_EXISTING\", \"enable_hooks\": false, 
> \"cluster\": \"devcluster\", \"task\": {\"processes\": [{\"daemon\": false, 
> \"name\": \"fetch_package\", \"ephemeral\": false, \"max_failures\": 1, 
> \"min_duration\": 5, \"cmdline\": \"cp /vagrant/hello_world.py . \u0026\u0026 
> echo a146647a4293aef6ae45c6d5699c1f96 \u0026\u0026 chmod +x hello_world.py\", 
> \"final\": false}, {\"daemon\": false, \"name\": \"hello_world\", 
> \"ephemeral\": false, \"max_failures\": 1, \"min_duration\": 5, \"cmdline\": 
> \"python -u hello_world.py\", \"final\": false}], \"name\": \"fetch_package\", 
> \"finalization_wait\": 30, \"max_failures\": 1, \"max_concurrency\": 0, 
> \"resources\": {\"gpu\": 0, \"disk\": 8388608, \"ram\": 1048576, 
> \"cpu\": 1.0}, \"constraints\": [{\"order\": 
> [\"fetch_package\", \"hello_world\"]}]}, \"production\": false, \"role\": 
> \"www-data\", \"tier\": \"preemptible\", \"lifecycle\": {\"http\": 
> {\"graceful_shutdown_endpoint\": \"/quitquitquit\", \"port\": \"health\", 
> \"shutdown_endpoint\": \"/abortabortabort\"}}, \"priority\": 
> 0}"},"metadata":[],"container":{"cachedHashCode":-270656463,"setField":"MESOS","value":{"cachedHashCode":962,"volumes":[]}}},"assignedPorts":{},"instanceId":0},"status":"PENDING","failureCount":0,"taskEvents":[{"cachedHashCode":0,"timestamp":1496298562764,"status":"PENDING","scheduler":"aurora"}]},"oldState":{}}
> 
> 
> Thanks,
> 
> David McLaughlin
> 
>