It's hard to answer without additional detail.

I'll make some assumptions, and answer assuming those are true:

1) I assume your S1 API is blocking, and that each request to it is handled 
on its own thread, and that those threads come from a fixed-size thread 
pool of size 30.

2) I assume that S2 is also blocking, that it returns a promise when you 
call it, and that you then need to keep polling another API, which I'll 
call get-s2-result, that takes the promise, is also blocking, and returns 
the result, an error, or an indication that the result is still not 
available.

3) I assume you want to give your blocking S1 API pseudo non-blocking 
behavior.

4) Thus, you would have S1 return a promise. When called, you do not 
process the request; instead, you put it in a "to be processed" queue and 
return a promise that the request will eventually be processed and produce 
a value or an error.

5) Similarly, you need a way for the client to check the promise, so you 
will also expose a blocking API, which I will call get-s1-result, that 
takes the promise and returns either the result, an error, or an 
indication that the result is not available yet.

6) Your promise will take the form of a GUID that uniquely identifies the 
queued request.

7) This is your API's design. Your clients can now start work and 
integration against your APIs while you implement the functionality behind 
them. A rough sketch of that surface follows.
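
The handler name s1-request and the response maps here are just my 
assumptions for illustration, not the only way to shape it:

  (s1-request {:some "request"})
  ;=> "a1b2c3..."                      ; the promise: a GUID string

  (get-s1-result "a1b2c3...")
  ;=> {:status :pending}               ; not available yet
  ;   or {:status :done  :result ...}  ; the result
  ;   or {:status :error :error  ...}  ; an error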

8) Now you need to implement the queuing up of requests. This is where you 
have options, and core.async is one of them. I do agree with the advice of 
not using core.async unless simpler tools don't work, so I will start with 
a simpler tool: a Future per request, plus a global atom holding a map 
from promise GUID to its Future.

9) So you create a global atom, which contains a map from GUID -> Future.

10) On every request to S1, you create a new GUID and a new Future, and 
you swap! assoc the GUID to the Future, as sketched below.
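
A minimal sketch of points 9 and 10, assuming java.util.UUID for the GUID 
(handle-request, the body of the Future, is defined under point 11 below):

  (defonce pending (atom {})) ; GUID string -> Future

  (declare handle-request)    ; the Future's task, see point 11

  (defn s1-request
    "Queues the request for asynchronous processing and returns the
    GUID acting as its promise."
    [request]
    (let [guid (str (java.util.UUID/randomUUID))
          fut  (future (handle-request request))]
      (swap! pending assoc guid fut)
      guid))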

11) The Future is your request handler. Inside it, you synchronously 
handle the request, whatever that means for you. So maybe you do some 
processing, then you call S2, and then you loop, calling get-s2-result 
every 100ms until it returns an error or a result. On every iteration, 
you also check that the time elapsed since you started looping has not 
exceeded some timeout X, so that you don't loop forever. If you 
eventually get a result or an error, you handle it however you need to, 
and eventually your Future itself returns a result or an error. It's 
important that you design the Future task to time out eventually, so that 
you don't leak Futures stuck in infinite loops; you must be able to 
deterministically know that the Future will finish.
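
Here is a sketch of that task. I'm assuming your S2 client exposes two 
functions, call-s2 and get-s2-result, with get-s2-result returning 
{:status :pending}, {:status :done :result ...} or 
{:status :error :error ...}; those names and shapes are placeholders:

  (declare call-s2 get-s2-result) ; your S2 client, names assumed

  (def poll-interval-ms 100)
  (def poll-timeout-ms 5000)

  (defn handle-request
    "Synchronously handles one S1 request inside its Future: calls S2,
    then polls get-s2-result every 100ms until it yields a result or an
    error, or the 5 second deadline passes. Guaranteed to finish."
    [request]
    (let [s2-promise (call-s2 request)
          deadline   (+ (System/currentTimeMillis) poll-timeout-ms)]
      (loop []
        (let [{:keys [status] :as res} (get-s2-result s2-promise)]
          (cond
            ;; S2 finished, one way or the other: return its map as-is.
            (contains? #{:done :error} status) res
            ;; Deadline passed: give up so the Future always terminates.
            (> (System/currentTimeMillis) deadline)
            {:status :error :error :timeout}
            ;; Still pending: wait 100ms and poll again.
            :else (do (Thread/sleep poll-interval-ms)
                      (recur)))))))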

12) Now you implement get-s1-result. Whenever it is called, you get the 
Future from the global atom map of Futures and call future-done? on it. 
If that returns false, you return that the result is not available yet. 
If it is done, you deref the Future, swap! dissoc its entry from your 
global atom, and return the result or error.
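
For example (future-done? is built into clojure.core):

  (defn get-s1-result
    "Returns {:status :pending} while the Future is still running; once
    it is done, derefs it, removes it from the pending map, and returns
    its result map."
    [guid]
    (if-let [fut (get @pending guid)]
      (if (future-done? fut)
        (let [res @fut]
          (swap! pending dissoc guid)
          res)
        {:status :pending})
      {:status :error :error :unknown-guid}))

Note that the entry is removed on the first successful read, so a second 
get-s1-result on the same GUID reports :unknown-guid; keep the entry 
around (or store the result in the map) if clients may poll again after 
completion.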

The only danger of this approach is that the set of pending Futures is 
unbounded. In practice, though, clients can call S1 and get-s1-result 
with at most 30 concurrent requests, because I assumed your APIs are 
blocking and bounded on a shared fixed thread pool of size 30.

Now say it takes you 1 second on average to process an S1 request, so 
your Future finishes on average in 1 second, and you time Futures out at 
5 seconds. Now take the worst case scenario: say S2 is down, so all 
requests take the maximum of 5 seconds to be handled, and say your 
clients are also maxing out your concurrency for S1, so you get around 30 
concurrent requests constantly. Say S1 takes 100ms to return the promise. 
What you get is this:

* Every second, you are creating 300 Futures, because every 100ms you 
process 30 new S1 requests.

So say we start at the beginning with 0 Futures. One second later, you 
have 300; five seconds later, you have 1500, but your first 300 time out, 
so you end up with 1200. At the 6th second, you have 1200 again, since 
300 more were queued but 300 more timed out, and from this point on, you 
have 1200 open Futures at the end of every second, with a peak of 1500.
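
(As a rule of thumb: creation rate × lifetime = 300 Futures per second × 
5 seconds = 1500 live Futures at the peak.)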

Thus you need to make sure you can handle 1500 open threads on your host.

Indirectly, this stabilizes because you made sure your Future tasks time 
out at 5 seconds, and because your S1 API is itself bounded to 30 
concurrent requests max.

If you'd prefer not to rely on the bound of the S1 requests, or you have 
a hard time knowing the timing of your S1, you can keep track of the 
count of pending Futures, and on a request to S1 where the count is above 
your bound, return an error instead of a promise, asking the client to 
retry the call a bit later, when you have more resources available.
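
A sketch of that guard, reusing pending and s1-request from above. The 
1500 bound comes from the worst case estimate; the check is racy (another 
request may slip in between the count and the assoc), but for a soft 
bound that's acceptable:

  (def max-pending 1500)

  (defn s1-request-bounded
    "Like s1-request, but rejects new work when too many Futures are
    already pending, asking the client to retry later."
    [request]
    (if (>= (count @pending) max-pending)
      {:status :error :error :busy}
      (s1-request request)))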

I hope this helps.

On Tuesday, 8 May 2018 13:45:00 UTC-7, Brjánn Ljótsson wrote:
>
> Hi!
>
> I'm writing a server-side (S1) function that initiates an action on 
> another server (S2) and regularly checks if the action has finished 
> (failed/completed). It should also be possible for a client to ask S1 for 
> the status of the action performed by S2.
>
> My idea is to create an uid on S1 that represents the action and the uid 
> is returned to the client. S1 then asynchronously polls S2 for action 
> status and updates an atom with the uid as key and status as value. The 
> client can then request the status of the uid from S1. Below is a link to 
> my proof of concept code (without any code for the client requests or 
> timeout guards) - and it is my first try at writing code using core.async.
>
> https://gist.github.com/brjann/d80f1709b3c17ef10a4fc89ae693927f
>
> The code can be tested with for example (start-poll) or (repeatedly 10 
> start-poll) to test the behavior when multiple requests are made.
>
> The code seems to work, but is it a correct use of core.async? One thing 
> I'm wondering is if the init-request and poll functions should use threads 
> instead of go-blocks, since the http requests may take a few hundred 
> milliseconds and many different requests (with different uids) could be 
> made simultaneously. I've read that "long-running" tasks should not be put 
> in go blocks. I haven't figured out how to use threads though.
>
> I would be thankful for any input!
>
> Best wishes,
> Brjánn Ljótsson
>
