Somehow, my impression was that Avatica was implemented such that state is carried in the request/response, so the server does not need to maintain any session-specific state for a JDBC connection -- just as Julian described. Obviously I was wrong. I'm curious exactly what state is maintained on the server. Maybe it is possible to make the protocol server-independent without sacrificing too much performance.
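For example, I was imagining something roughly like the sketch below, where the server serializes whatever it would otherwise keep in memory into an opaque token that the client echoes back on the next call, so any server behind the load balancer can resume the fetch. To be clear, these class and field names are hypothetical illustrations of the idea, not Avatica's actual protocol:

import java.util.List;

// Hypothetical request/response shapes, NOT Avatica's actual protocol
// classes. The server serializes everything it would otherwise keep in
// memory (cursor position, parameter values, ...) into an opaque token,
// and the client echoes that token on the next call.
class FetchRequest {
  String connectionId;
  String statementId;
  byte[] resumeToken;   // echoed verbatim from the previous FetchResponse
}

class FetchResponse {
  List<Object[]> rows;  // the current frame of rows
  byte[] resumeToken;   // opaque, serialized cursor/parameter state
  boolean done;         // true when the result set is exhausted
}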
-Jiandan

> On Aug 9, 2018, at 10:28 AM, Julian Hyde <[email protected]> wrote:
>
> One important question is: what state is shared among servers? Some pieces
> of state: connection state, statement state, result set state (parameter
> values and position in a scroll). It is not unusual for a cluster of
> servers to use a shared cache for some of the larger / slower-changing
> pieces of state.
>
> I could imagine a system where connections and statements are shared but
> result set state is not. I could imagine another system where nothing is
> shared. The solution would be different for those cases.
>
> A possible technical solution in Avatica might be for Avatica to transmit
> all necessary state in the return from each RPC, and for the next RPC to
> transmit that state back. Thus all necessary state is on the client, and
> in each RPC call.
>
> But such a solution would not be easy to implement, and would not perform
> as well as a system that makes some reasonable assumptions about what
> state can be left on the server.
>
> Julian
>
>
>> On Aug 9, 2018, at 8:43 AM, Josh Elser <[email protected]> wrote:
>>
>> Hi Jiandan,
>>
>> Glad you found my write-up on this. One of the original design goals was
>> to *not* implement routing logic in the client. Sticky sessions are by
>> far the easiest way to implement this.
>>
>> There is some retry logic in the Avatica client to resubmit requests when
>> a server responds that it doesn't have a connection/statement cached that
>> the client thinks it should (e.g. the load balancer flipped the client to
>> a newer server). I'm still a little concerned about this level of
>> "smarts" :)
>>
>> I don't know if there is a fancier solution that we can do in Avatica. We
>> could consider sharing state between Avatica servers, but I think it is
>> database-dependent whether or not you could correctly reconstruct an
>> iteration through a result set.
>>
>> I had talked with a dev on the Apache Hive project. He suggested that
>> HiveServer2 just fails the query when the client is mid-query and the
>> server dies (which is reasonable -- servers failing should be an
>> infrequent occurrence).
>>
>>
>> On 8/8/18 8:09 PM, JD Zheng wrote:
>>> Hi,
>>>
>>> Our query engine uses Calcite as its parser/optimizer, and the
>>> enumerable runtime when needed to federate different storage engines. We
>>> are trying to enable JDBC access to our query engine. Everything works
>>> smoothly when we only have one Calcite/Avatica server.
>>>
>>> However, JDBC calls fail if we run multiple instances of Calcite/Avatica
>>> servers behind a generic load balancer. Given that the JDBC server is
>>> not stateless, this problem was not a surprise. I searched around, and
>>> here are the two options suggested by the Phoenix developers
>>> (https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
>>>
>>> 1. Sticky sessions: make the router always route a given client to the
>>> same server.
>>> 2. Client-driven routing: implement Avatica's protocol, which passes an
>>> identifier to the load balancer to control how the request is routed to
>>> the backend servers.
>>>
>>> Before we rush into any implementation, we would really appreciate it if
>>> anyone can share experience or thoughts regarding this issue.
>>>
>>> Thanks,
>>> -Jiandan
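P.S. In case it helps anyone reading the archives later: my mental model of the retry logic Josh mentions is roughly the sketch below. The names are hypothetical (the real Avatica client code is structured differently); it only illustrates the reconnect-and-resubmit behavior:

// Hypothetical sketch of client-side "replay" retry logic, NOT the actual
// Avatica client code. If a server behind the load balancer does not
// recognize the client's connection (e.g. the balancer flipped the client
// to a fresh server), the client re-opens its connection and statements on
// that server and resubmits the request.
class Request { String connectionId; }
class Response { }
class MissingStateException extends RuntimeException { }

interface Rpc {
  Response send(Request request);    // may throw MissingStateException
  void reopen(String connectionId);  // re-create connection/statements
}

class ReplayingClient {
  private static final int MAX_RETRIES = 3;
  private final Rpc rpc;

  ReplayingClient(Rpc rpc) { this.rpc = rpc; }

  Response sendWithReplay(Request request) {
    for (int attempt = 0; ; attempt++) {
      try {
        return rpc.send(request);
      } catch (MissingStateException e) {
        if (attempt >= MAX_RETRIES) throw e;
        // Rebuild connection/statement state on the new server and retry.
        // A result set's fetch position generally cannot be rebuilt this
        // way, which is why mid-fetch failover remains the hard case.
        rpc.reopen(request.connectionId);
      }
    }
  }
}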
