Re: Asynchronous http poll
I think I want to simplify some things. Normally, client/server async is implemented by the client/server framework: the interchange of messages between the client and server over the HTTP connection on the socket is made non-blocking, but the entire request/response still happens within the context of a single HTTP connection. This often allows the server to take in many more requests, and the client to overlap other processing while waiting. That's because open IO operations are cheaper than open threads, so modern hardware supports many more concurrent open connections on a socket than it does open threads. If that's what you want, you need to move to a different client/server framework that supports non-blocking exchanges, such as Netty in the Java world.

If you want to avoid switching frameworks, or your operations are going to be really long, or you want to survive network dropouts, then you can go for something more like what you were trying to do. In that case, you need to choose between push and pull.

If pull, you want a distributed map as I said, which is often known as a database. A SQL table can do; a NoSQL key/value store also works. I'm a fan of DynamoDB for this. Ideally, you want your distributed map to have the same scaling capability as your server APIs, otherwise it will become a bottleneck. You can also go with a distributed queue, like RabbitMQ or AWS SQS. This allows the client to use reactor-style evented response handling: instead of having the client poll your get API to know if a response is available, you put a message on the queue saying GUID-X is now done, and your client works through the queue and, for every message in it, polls you for the result.

If you want push, you need the client to expose an endpoint to be contacted when the work is done. You can do this easily with AWS SNS, for example. This could mean the client exposes a call-when-done(guid, result) API. It tells your server about it, and when you are done, you send a request to that API to notify the client. This lets the client know right away that it's done, and saves it the CPU work of having to poll. But it gets complicated: what happens if you fail to reach the client's endpoint? With push, you can often miss a response. To avoid that, people often offer both pull and push.

So my practical recommendation to you would be to first look into a non-blocking client/server framework like Netty. Maybe that's all you need. If not, then look into using DynamoDB, SQS, SNS, or similar products.

-- 
You received this message because you are subscribed to the Google Groups "Clojure" group. To post to this group, send email to clojure@googlegroups.com Note that posts from new members are moderated - please be patient with your first post. To unsubscribe from this group, send email to clojure+unsubscr...@googlegroups.com For more options, visit this group at http://groups.google.com/group/clojure?hl=en
--- 
You received this message because you are subscribed to the Google Groups "Clojure" group. To unsubscribe from this group and stop receiving emails from it, send an email to clojure+unsubscr...@googlegroups.com. For more options, visit https://groups.google.com/d/optout.
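[Editor's note: the queue-based pull model described above could be sketched roughly like this. All names are hypothetical, and a core.async channel stands in for the real distributed queue (SQS, RabbitMQ); `fetch-result` is a placeholder for the client's HTTP call back to the server.]

```clojure
(require '[clojure.core.async :as a])

(def done-queue (a/chan 100))           ; stand-in for SQS / RabbitMQ

(defn announce-done! [guid]
  ;; server side: put a "GUID-X is now done" message on the queue
  (a/put! done-queue guid))

(defn fetch-result [guid]
  ;; client side: placeholder for the one HTTP GET per finished GUID
  {:guid guid :result :ok})

(defn client-loop []
  ;; the client reacts to queue messages instead of blindly polling
  (a/go-loop []
    (when-let [guid (a/<! done-queue)]
      (println "result for" guid "->" (fetch-result guid))
      (recur))))
```

With a real queue service, `announce-done!` would become a publish call and `client-loop` a consumer, but the shape is the same: the client only issues a fetch once it has been told the result exists.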
Re: Asynchronous http poll
Thank you so much Didier for your detailed response! I will need some time to digest it but a lot of what you write sounds very reasonable. Thanks! Brjánn

On 15 May 2018 at 02:57, Didier wrote:
> Oh, I forgot something important. [...]
Re: Asynchronous http poll
Oh, I forgot something important.

If you're hoping to have multiple hosts, and run this application in a distributed way, you really should not do it that way. Things get a lot more complicated. The problem is, your request queue is local to a host. So if the client creates the Future on S1 on host A, then calls get-s1-result and is routed to host B, that Future will be missing.

So what you need is to turn that atom map of Futures into a distributed one. You could still have the Future atom map, but as the last step of each Future, you need to update the distributed map with the result or error. And if you want statuses, in your loop, you should also update it for status. So on get-s1-result, you just check the value of that distributed map. Each host still processes its own share of requests, but the distributed map exposes their results and processing status to all other hosts.

There are many other ways to handle this issue. For example, I believe you can route the client to a direct connection to the particular host that handled S1, so that calls to get-s1-result go to that specific host. The downside is that it gets harder to evenly distribute the polls. It also takes more complex infrastructure: all hosts must have their IPs exposed to the clients, for example. Another way is that the VIP might be able to support smarter routing based on some indicator, or you use a master host which delegates back and has that logic itself.

An alternate way is to let go of the polling and instead have a push model: your server calls the client to tell it the request is handled. This also has its own complexities and trade-offs.

Anyways, in a distributed environment, async and non-blocking become quite a bit more complex.

On Monday, 14 May 2018 17:35:39 UTC-7, Didier wrote:
> Its hard to answer without additional detail. [...]
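[Editor's note: the "last step of each Future updates the distributed map" idea could be sketched as below. An atom stands in for the real shared store (a DynamoDB table, SQL row, etc. keyed by GUID), and `put-result!` is a hypothetical wrapper for that store's write call.]

```clojure
(defonce shared-results (atom {}))       ; stand-in for DynamoDB / SQL / etc.

(defn put-result! [guid status value]
  ;; in production this would be a write against the shared table
  (swap! shared-results assoc guid {:status status :value value}))

(defn run-request [guid work-fn]
  ;; each host still runs its own Futures, but the final step always
  ;; publishes the outcome to the shared store
  (future
    (put-result! guid :processing nil)
    (try
      (put-result! guid :done (work-fn))
      (catch Exception e
        (put-result! guid :error (.getMessage e))))))

(defn get-s1-result [guid]
  ;; any host can answer this, because the store is shared
  (get @shared-results guid {:status :unknown}))
```

Because every host reads and writes the same store, it no longer matters which host the client's get-s1-result call is routed to.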
Re: Asynchronous http poll
It's hard to answer without additional detail. I'll make some assumptions, and answer assuming those are true:

1) I assume your S1 API is blocking, that each request to it is handled on its own thread, and that those threads come from a fixed-size thread pool with a size of 30.

2) I assume that S2 is also blocking, and that it returns a promise when you call it. And that you need to keep polling another API, which I'll call get-s2-result, which takes the promise, is also blocking, and returns the result, an error, or that it's still not available.

3) I assume you want to turn your blocking S1 API into pseudo non-blocking behavior.

4) Thus, you would have S1 return a promise. When called, you do not process the request; you queue the request in a "to be processed" queue, and you return a promise that eventually the request will be processed and will have a value, or an error.

5) Similarly, you need a way for the client to check the promise, so you will also expose a blocking API that I will call get-s1-result, which takes the promise and returns either the result, an error, or that it's not available yet.

6) Your promise will take the form of a GUID that uniquely identifies the queued request.

7) This is your API design. Your clients can now start work and integration with your APIs, while you implement its functionality.

8) Now you need to implement the queuing up of requests. This is where you have options, and core.async is one of them. I do agree with the advice of not using core.async unless simpler tools don't work. So I will start with a simpler tool: Future, and a global atom map from promise GUID to request map.

9) So you create a global atom, which contains a map from GUID -> Future.

10) On every request to S1, you create a new GUID and Future, and you swap! assoc the GUID with the Future.

11) The Future is your request handler. So in it, you synchronously handle the request, whatever that means for you. Maybe you do some processing, then you call S2, and then you loop, and every 100ms in the loop you call get-s2-result until it returns an error or a result. Every time you loop, you check that the time elapsed since you started looping is not more than X timeout, so that you don't loop forever. If you eventually get a result or an error, you handle them however you need to, and eventually your future itself returns a result or an error. It's important you design the future task to time out eventually, so that you don't leak futures that get stuck in infinite loops. You must be able to deterministically know that the future will finish.

12) Now you implement get-s1-result. Whenever it is called, you get the future from the global atom map of futures, and you call future-done? on it. If false, you return that the result is not available yet. If it is done, you deref the future, swap! dissoc its map entry from your global atom, and return the result or error.

The only danger of this approach is that the Future queue is unbounded. What happens is that clients can call S1 and get-s1-result with at most 30 concurrent requests, because I assumed your APIs are blocking and bounded on a shared fixed thread pool of size 30. Now say it takes you 1 second on average to process an S1 request, so your future finishes on average in 1 second, and you time them out at 5 seconds. Now take the worst-case scenario: S2 is down, so all requests take the maximum of 5 seconds to be handled. Say your clients are also maxing out your concurrency for S1, so you get around 30 concurrent requests constantly, and S1 takes 100ms to return the promise. What you get is this: every second, you are creating 300 Futures, because every 100ms you process 30 new S1 requests. So at the beginning you have 0 Futures; one second later, you have 300; 5 seconds later, you have 1500, but your first 300 time out, so you end up with 1200. At the 6th second, you have 1200 again, since 300 more were queued but 300 more timed out, and from this point on, every second you have 1200 open Futures, with a max of 1500. Thus you need to make sure you can handle 1500 open threads on your host. Indirectly, this stabilizes because you made sure your Future tasks time out at 5 seconds, and because your S1 API is itself bounded to 30 concurrent requests max.

If you'd prefer not to rely on the bound of the S1 requests, and you have a hard time knowing the timing of your S1, you can keep track of the count of queued Futures, and on a request to S1 where the count is above your bound, return an error instead of a promise, asking the client to wait and retry the call in a bit, when you have more resources available.

I hope this helps.

On Tuesday, 8 May 2018 13:45:00 UTC-7, Brjánn Ljótsson wrote:
> Hi! I'm writing a server-side (S1) function that initiates an action on another server (S2) [...]
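[Editor's note: steps 9-12 above could be sketched as follows. `handle-request` is a hypothetical stand-in for the real work of step 11 (calling S2 and polling get-s2-result with a timeout).]

```clojure
;; step 9: a global atom mapping GUID -> Future
(defonce requests (atom {}))

(defn handle-request [req]
  ;; placeholder for step 11: call S2, poll get-s2-result every 100ms,
  ;; give up after X timeout
  {:result (str "processed " req)})

(defn s1 [req]
  ;; step 10: new GUID + Future, swap! assoc'd into the global atom
  (let [guid (str (java.util.UUID/randomUUID))]
    (swap! requests assoc guid (future (handle-request req)))
    guid))                               ; the "promise" handed to the client

(defn get-s1-result [guid]
  ;; step 12: check the future, dissoc it once its value is consumed
  (if-let [fut (get @requests guid)]
    (if (future-done? fut)
      (let [res @fut]
        (swap! requests dissoc guid)
        res)
      {:status :pending})
    {:error :unknown-guid}))
```

A client would call `(s1 request)` to get the GUID, then poll `(get-s1-result guid)` until it returns something other than `{:status :pending}`.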
Re: Asynchronous http poll
Hi Oliy,

I really appreciate your input since I'm totally new to writing asynchronous tasks. If I understand you correctly, promises are a better way to go if only one value is collected from S2 by S1. However, S1 will keep polling S2 (once every 1.5 secs, by S2's API specification) for updates on the status of the task that S2 is running. The task can be in several different states, and S1 needs to know the current state of the task and pass it on to the client. So, actually, multiple and different values will be sent down the channel. The polling will quit once the task has been completed or timed out (ca 5 minutes). That's why I went for core.async, as it seems to be suitable for launching a separate "thread" that takes care of the polling. Does this make any sense as a rationale for using core.async?

Thanks!
Brjánn

On 13 May 2018 at 20:58, Oliver Hine wrote:
> Hi, Not a direct answer, but something that may help you simplify your problem [...]
Re: Asynchronous http poll
Hi,

Not a direct answer, but something that may help you simplify your problem: I have a general rule to avoid core.async when only one value would ever be sent down the channel. For these use cases promises are an order of magnitude simpler, giving you control of the thread of operation, simple testing for completion (future-done?), simple timeouts, and no cleanup required afterwards.

From what I understand, a promise would fit your requirements and I think would be much easier to reason about.

Hope this helps,
Oliy

On Tuesday, 8 May 2018 21:45:00 UTC+1, Brjánn Ljótsson wrote:
> Hi! I'm writing a server-side (S1) function that initiates an action on another server (S2) [...]
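[Editor's note: the single-value case Oliy describes might look like this. The `Thread/sleep` stands in for the real polling work; everything else is plain clojure.core.]

```clojure
(defn start-poll []
  ;; deliver the one final value into a promise from a plain future;
  ;; no channel, and nothing to clean up afterwards
  (let [result (promise)]
    (future
      (Thread/sleep 50)                 ; placeholder: poll S2 until done
      (deliver result {:status :completed}))
    result))

(let [r (start-poll)]
  ;; non-blocking completion check (the promise analogue of future-done?)
  (realized? r)
  ;; or block with a simple timeout and a default value
  (deref r 5000 {:status :timeout}))
```

`realized?` answers the "is it done yet?" question without blocking, and the three-argument `deref` gives the simple timeout Oliy mentions.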
Asynchronous http poll
Hi!

I'm writing a server-side (S1) function that initiates an action on another server (S2) and regularly checks if the action has finished (failed/completed). It should also be possible for a client to ask S1 for the status of the action performed by S2.

My idea is to create a uid on S1 that represents the action; the uid is returned to the client. S1 then asynchronously polls S2 for action status and updates an atom with the uid as key and status as value. The client can then request the status of the uid from S1. Below is a link to my proof-of-concept code (without any code for the client requests or timeout guards) - and it is my first try at writing code using core.async.

https://gist.github.com/brjann/d80f1709b3c17ef10a4fc89ae693927f

The code can be tested with for example (start-poll) or (repeatedly 10 start-poll) to test the behavior when multiple requests are made.

The code seems to work, but is it a correct use of core.async? One thing I'm wondering is if the init-request and poll functions should use threads instead of go blocks, since the http requests may take a few hundred milliseconds and many different requests (with different uids) could be made simultaneously. I've read that "long-running" tasks should not be put in go blocks. I haven't figured out how to use threads though.

I would be thankful for any input!

Best wishes,
Brjánn Ljótsson
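[Editor's note: on the "how to use threads" question, a minimal sketch of moving the blocking poll onto a real thread via `clojure.core.async/thread` might look like this. `check-status` is a hypothetical stand-in for the blocking HTTP call to S2.]

```clojure
(require '[clojure.core.async :as a])

(defonce statuses (atom {}))

(defn check-status [uid]
  ;; placeholder for the blocking HTTP request (may take hundreds of ms)
  (Thread/sleep 100)
  (if (< (rand) 0.3) :completed :in-progress))

(defn start-poll [uid]
  ;; a/thread runs its body on a separate (cached) thread, so blocking
  ;; here does not starve the small thread pool shared by go blocks
  (a/thread
    (loop []
      (let [status (check-status uid)]
        (swap! statuses assoc uid status)
        (when (= status :in-progress)
          (Thread/sleep 1500)           ; S2's API allows one poll per 1.5 s
          (recur))))))
```

Like `go`, `a/thread` returns a channel carrying the body's result, so callers that want to coordinate on completion can still take from it; the difference is only where the body runs.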