That totally makes sense. The only inconvenience is that every user has to repeat
the same work of setting up sticky sessions.
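
To make that concrete for the archive, the per-deployment work is fairly small. A
rough sketch, assuming HAProxy in front of two Avatica servers (the names,
addresses, and ports below are only illustrative):

    frontend avatica_frontend
        mode http
        bind *:8765
        default_backend avatica_servers

    backend avatica_servers
        mode http
        balance source              # pin each client IP to one backend server
        server avatica1 10.0.0.1:8765 check
        server avatica2 10.0.0.2:8765 check

Source-IP hashing is the simplest form of stickiness; cookie-based stickiness is
another option, with the caveat that under source hashing all clients behind a
shared NAT will land on the same backend.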

-Jiandan 
> On Aug 9, 2018, at 1:08 PM, Josh Elser <[email protected]> wrote:
> 
> The decision to avoid "routing logic" in the client was made because other
> systems can do a better job of it than we can in Avatica. There are systems
> that are specifically designed for this -- it's a clear architectural boundary
> that says one Avatica client expects to talk to one Avatica server.
> 
> On 8/9/18 1:21 PM, JD Zheng wrote:
>> Hi, Josh,
>> Thank you for sharing your experience, and the nice write-up on the
>> Hortonworks website too. It's very helpful. I am just curious why "not
>> implement routing logic in the client" was one of the original design goals.
>> Wouldn't routing logic make Avatica easier to use? I agree that sharing state
>> between Avatica servers adds too much complexity to be worth it.
>> Is the concern with client "smarts" that a retried request would most likely go
>> to the same server and fail again, and thus the overall response time would be
>> unnecessarily long?
>> -Jiandan
>>> On Aug 9, 2018, at 8:43 AM, Josh Elser <[email protected]> wrote:
>>> 
>>> Hi Jiandan,
>>> 
>>> Glad you found my write-up on this. One of the original design goals was to 
>>> *not* implement routing logic in the client. Sticky-sessions is by far the 
>>> easiest way to implement this.
>>> 
>>> There is some retry logic in the Avatica client to resubmit requests when a
>>> server responds that it doesn't have a connection/statement cached that the
>>> client thinks it should have (e.g. the load balancer flipped the client over
>>> to a different server). I'm still a little concerned about this level of
>>> "smarts" :)
>>> 
>>> I don't know if there is a fancier solution that we can do in Avatica. We 
>>> could consider sharing state between Avatica servers, but I think it is 
>>> database-dependent as to whether or not you could correctly reconstruct an 
>>> iteration through a result set.
>>> 
>>> I had talked with a dev on the Apache Hive project. He suggested that
>>> HiveServer2 just fails the query when the client is mid-query and the
>>> server dies (which is reasonable -- servers failing should be an infrequent
>>> event).
>>> 
>>> 
>>> On 8/8/18 8:09 PM, JD Zheng wrote:
>>>> Hi,
>>>> Our query engine uses Calcite as the parser/optimizer and, when needed, the
>>>> Enumerable runtime to federate different storage engines. We are trying to
>>>> enable JDBC access to our query engine. Everything works smoothly when we
>>>> only have one Calcite/Avatica server.
>>>> However, JDBC calls fail if we run multiple instances of the
>>>> Calcite/Avatica server behind a generic load balancer. Given that the JDBC
>>>> server is not stateless, this problem was not a surprise. I searched
>>>> around, and here are the two options suggested by the Phoenix developers
>>>> (https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
>>>> 1. Sticky sessions: make the router always route a given client to the same
>>>> server.
>>>> 2. Client-driven routing: implement Avatica's protocol, which passes an
>>>> identifier to the load balancer to control how each request is routed to
>>>> the backend servers.
>>>> Before we rush into any implementation, we would really appreciate it if
>>>> anyone could share their experience or thoughts regarding this issue. Thanks,
>>>> -Jiandan
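
For the archive: once stickiness is handled by the balancer, the client side stays
unchanged and simply points the Avatica remote (thin) driver at the balancer's
address. A rough sketch of what that looks like (the host name, port, and query
below are only placeholders):

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class AvaticaViaLoadBalancer {
      public static void main(String[] args) throws Exception {
        // Optional with JDBC 4 auto-loading; shown to make the driver explicit.
        Class.forName("org.apache.calcite.avatica.remote.Driver");

        // Point the remote driver at the load balancer, not at an individual
        // Avatica server; the balancer's sticky sessions keep every request on
        // this connection going to the same backend.
        String url = "jdbc:avatica:remote:"
            + "url=http://lb.example.com:8765;serialization=protobuf";

        try (Connection conn = DriverManager.getConnection(url);
             Statement stmt = conn.createStatement();
             // Trivial smoke-test query; replace with one against your schema.
             ResultSet rs = stmt.executeQuery("VALUES 1")) {
          while (rs.next()) {
            System.out.println(rs.getInt(1));
          }
        }
      }
    }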
