Somehow, my impression was that the Avatica implementation carried the 
state in each request/response, so that the server does not need to 
maintain any state for a JDBC session, just as Julian described. 
Obviously I was wrong. I am curious exactly what state is maintained on 
the server. Maybe it is possible to make it server-independent without 
sacrificing too much performance. 
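
For concreteness, here is a rough sketch (in Java, with hypothetical 
names rather than Avatica's actual wire API) of what a fully 
client-held-state fetch could look like:

    import java.nio.ByteBuffer;
    import java.util.Arrays;
    import java.util.List;

    // Sketch only: every RPC carries the cursor position, so the server
    // keeps no per-session state and any replica can answer the next
    // call. All names here are hypothetical, not Avatica's actual API.
    public final class StatelessFetchSketch {

      // Opaque state the client echoes back verbatim; here, an offset.
      static byte[] encodeState(int offset) {
        return ByteBuffer.allocate(4).putInt(offset).array();
      }

      static int decodeState(byte[] state) {
        return ByteBuffer.wrap(state).getInt();
      }

      // Stand-in for a real result set; imagine it is re-read from
      // storage on whichever server receives the call.
      static final List<String> ROWS =
          Arrays.asList("row-0", "row-1", "row-2", "row-3", "row-4");

      record FetchResponse(List<String> rows, byte[] nextState) {}

      static FetchResponse fetch(byte[] state, int batchSize) {
        int offset = decodeState(state);
        int end = Math.min(offset + batchSize, ROWS.size());
        return new FetchResponse(ROWS.subList(offset, end), encodeState(end));
      }

      public static void main(String[] args) {
        byte[] state = encodeState(0);
        // Two fetches that could be served by *different* servers.
        FetchResponse first = fetch(state, 2);
        FetchResponse second = fetch(first.nextState(), 2);
        System.out.println(first.rows() + " then " + second.rows());
      }
    }

The cost is visible in the sketch: the result set has to be recomputed 
or re-read on whichever server receives the call, which is exactly the 
performance trade-off Julian mentions below.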

-Jiandan

> On Aug 9, 2018, at 10:28 AM, Julian Hyde <[email protected]> wrote:
> 
> One important question is: what state is shared among servers? Some pieces of 
> state: connection state, statement state, result set state (parameter values 
> and position in a scroll). It is not unusual for a cluster of servers to use 
> a shared cache for some of the larger / slower-changing pieces of state.
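> 
> For illustration only (hypothetical names, not Avatica's actual 
> design), the shared pieces might live in a cluster-wide map keyed by 
> id, while result-set cursors stay on the server that opened them:
> 
>     import java.util.Map;
>     import java.util.Properties;
>     import java.util.concurrent.ConcurrentHashMap;
> 
>     // Stand-in for a real distributed cache (a replicated map, etc.).
>     final class SharedSessionState {
>       record ConnectionState(Properties info) {}
>       record StatementState(String connectionId, String sql) {}
> 
>       static final Map<String, ConnectionState> connections =
>           new ConcurrentHashMap<>();
>       static final Map<String, StatementState> statements =
>           new ConcurrentHashMap<>();
>     }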
> 
> I could imagine a system where connection and statements are shared but 
> result set state is not. I could imagine another system where nothing is 
> shared. The solution would be different for those cases.
> 
> A possible technical solution might be for Avatica to transmit all 
> necessary state in the response to each RPC, and for the client to send 
> that state back with the next RPC. Thus all necessary state is on the 
> client, and in each RPC call.
> 
> But such a solution would not be easy to implement, and would not perform as 
> well as a system that makes some reasonable assumptions about what state can 
> be left on the server.
> 
> Julian
> 
> 
> 
> 
>> On Aug 9, 2018, at 8:43 AM, Josh Elser <[email protected]> wrote:
>> 
>> Hi Jiandan,
>> 
>> Glad you found my write-up on this. One of the original design goals was to 
>> *not* implement routing logic in the client. Sticky-sessions is by far the 
>> easiest way to implement this.
>> 
>> There is some retry logic in the Avatica client to resubmit requests when a 
>> server responds that it doesn't have a connection/statement cached that the 
>> client thinks it should have (e.g. the load balancer flipped the client to a 
>> different server). I'm still a little concerned about this level of "smarts" :)
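>> 
>> Roughly, the shape of that retry is the following (hypothetical names, 
>> not the actual client internals):
>> 
>>     // If the server reports an unknown statement (e.g. after the load
>>     // balancer moved us to another server), re-create it and resubmit
>>     // the request once before giving up.
>>     final class RetrySketch {
>>       static class UnknownStatementException extends RuntimeException {}
>> 
>>       interface Client {
>>         void openStatement(String statementId);
>>         java.util.List<Object[]> fetch(String statementId, long offset);
>>       }
>> 
>>       static java.util.List<Object[]> fetchWithRetry(
>>           Client client, String statementId, long offset) {
>>         try {
>>           return client.fetch(statementId, offset);
>>         } catch (UnknownStatementException e) {
>>           client.openStatement(statementId);
>>           return client.fetch(statementId, offset);
>>         }
>>       }
>>     }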
>> 
>> I don't know if there is a fancier solution that we can do in Avatica. We 
>> could consider sharing state between Avatica servers, but I think it is 
>> database-dependent as to whether or not you could correctly reconstruct an 
>> iteration through a result set.
>> 
>> I had talked with a dev on the Apache Hive project. He suggested that 
>> HiveServer2 just fails the query when the client is mid-query and the server 
>> dies (which is reasonable, since servers failing should be an infrequent 
>> event).
>> 
>> 
>> On 8/8/18 8:09 PM, JD Zheng wrote:
>>> Hi,
>>> Our query engine uses Calcite as its parser/optimizer, and the enumerable 
>>> runtime when needed to federate different storage engines. We are trying to 
>>> enable JDBC access to our query engine. Everything works smoothly when we 
>>> only have one calcite/avatica server.
>>> However, JDBC calls fail if we run multiple instances of the 
>>> calcite/avatica server behind a generic load balancer. Given that the JDBC 
>>> server is not stateless, this problem was not a surprise. I searched around, 
>>> and here are the two options suggested by the Phoenix developers 
>>> (https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
>>> 1. Sticky sessions: make the router always route a given client to the 
>>> same server (see the HAProxy sketch below). 
>>> 2. Client-driven routing: implement Avatica's protocol, which passes an 
>>> identifier to the load balancer to control how the request is routed to the 
>>> backend servers.
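>>> For option 1, something like the following HAProxy snippet might be a 
>>> starting point (a sketch, untested; it assumes source-IP stickiness is 
>>> acceptable for our clients, and the addresses are placeholders):
>>> 
>>>     listen avatica
>>>         bind *:8765
>>>         mode http
>>>         balance source          # hash client address -> same backend
>>>         server qs1 10.0.0.1:8765 check
>>>         server qs2 10.0.0.2:8765 check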
>>> Before we rush into any implementation, we would really appreciate it if 
>>> anyone could share experience or thoughts regarding this issue. Thanks,
>>> -Jiandan
> 
