Totally makes sense. The only inconvenience is that every user has to repeat the same work of setting up sticky sessions.
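For anyone else going down the sticky-session route, the load-balancer side is only a few lines. A minimal sketch, assuming HAProxy is the balancer and using source-IP hashing to pin each client to one Avatica server (the host names and the 8765 port are placeholders for your own deployment):

    frontend avatica_front
        bind *:8765
        mode http
        default_backend avatica_servers

    backend avatica_servers
        mode http
        # Hash the client's source IP so every request from a given client
        # lands on the same Avatica server.
        balance source
        server avatica1 10.0.0.1:8765 check
        server avatica2 10.0.0.2:8765 check

If clients share a source IP (e.g. behind NAT), cookie-based stickiness is an alternative, though that depends on the client's HTTP layer honoring cookies; source-IP hashing keeps the JDBC client completely unaware of the balancer.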
-Jiandan

> On Aug 9, 2018, at 1:08 PM, Josh Elser <[email protected]> wrote:
>
> The decision to avoid "routing logic" in the client was that other systems
> can do a better job than we can in Avatica. There are other systems which are
> specifically designed for doing this -- it's a clear architectural boundary
> that says one Avatica client expects to talk to one Avatica server.
>
> On 8/9/18 1:21 PM, JD Zheng wrote:
>> Hi, Josh,
>>
>> Thank you for sharing your experience and the nice writeup on the
>> Hortonworks website too. It's very helpful. I am just curious why "not
>> implementing routing logic in the client" was one of the original design
>> goals? Doesn't it make Avatica easier to use? I agree that sharing state
>> between Avatica servers is too much complexity to be worth it.
>>
>> Is the concern about client "smarts" that the retried request will most
>> likely go to the same server and fail again, so the overall response time
>> becomes unnecessarily long?
>>
>> -Jiandan
>>
>>> On Aug 9, 2018, at 8:43 AM, Josh Elser <[email protected]> wrote:
>>>
>>> Hi Jiandan,
>>>
>>> Glad you found my write-up on this. One of the original design goals was to
>>> *not* implement routing logic in the client. Sticky sessions are by far the
>>> easiest way to implement this.
>>>
>>> There is some retry logic in the Avatica client to resubmit requests when a
>>> server responds that it doesn't have a connection/statement cached that the
>>> client thinks it should (e.g. the load balancer flipped the client to a
>>> newer server). I'm still a little concerned about this level of "smarts" :)
>>>
>>> I don't know if there is a fancier solution that we can do in Avatica. We
>>> could consider sharing state between Avatica servers, but I think it is
>>> database-dependent as to whether or not you could correctly reconstruct an
>>> iteration through a result set.
>>>
>>> I had talked with a dev on the Apache Hive project. He suggested that
>>> HiveServer2 just fails the query when the client is mid-query and the
>>> server dies (which is reasonable -- servers failing should be an infrequent
>>> occurrence).
>>>
>>> On 8/8/18 8:09 PM, JD Zheng wrote:
>>>> Hi,
>>>>
>>>> Our query engine uses Calcite as the parser/optimizer and the enumerable
>>>> runtime, if needed, to federate different storage engines. We are trying
>>>> to enable JDBC access to our query engine. Everything works smoothly when
>>>> we only have one Calcite/Avatica server.
>>>>
>>>> However, JDBC calls will fail if we run multiple instances of
>>>> Calcite/Avatica servers behind a generic load balancer. Given that the
>>>> JDBC server is not stateless, this problem was not a surprise. I searched
>>>> around, and here are the two options suggested by the Phoenix developers
>>>> (https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
>>>>
>>>> 1. Sticky sessions: make the router always route a given client to the
>>>> same server.
>>>> 2. Client-driven routing: implement Avatica's protocol, which passes an
>>>> identifier to the load balancer to control how the request is routed to
>>>> the backend servers.
>>>>
>>>> Before we rush into any implementation, we would really appreciate it if
>>>> anyone could share experience or thoughts regarding this issue.
>>>>
>>>> Thanks,
>>>> -Jiandan
