The decision to avoid "routing logic" in the client came down to the fact that other systems can do a better job of it than we can in Avatica. There are other systems specifically designed for this -- and it's a clear architectural boundary that says one Avatica client expects to talk to one Avatica server.

On 8/9/18 1:21 PM, JD Zheng wrote:
Hi, Josh,

Thank you for sharing your experience, and the nice write-up on the Hortonworks 
website too. It's very helpful. I am just curious why "not implementing routing 
logic in the client" was one of the original design goals? Wouldn't routing logic 
make Avatica easier to use? I agree that sharing state between Avatica servers 
is too much complexity and not worth it.

Is the concern with client "smarts" that a retried request will most likely go to 
the same server and fail again, making the overall response time 
unnecessarily long?

-Jiandan

On Aug 9, 2018, at 8:43 AM, Josh Elser <josh.el...@gmail.com> wrote:

Hi Jiandan,

Glad you found my write-up on this. One of the original design goals was to 
*not* implement routing logic in the client. Sticky sessions are by far the 
easiest way to implement this.
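
For concreteness, a minimal sketch of the client side under sticky sessions 
(the hostname and port are placeholders): the application points Avatica's 
remote JDBC driver at the load balancer, and the balancer, not the client, 
pins the session to one backend.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.Statement;

    public class StickySessionClient {
        public static void main(String[] args) throws Exception {
            // The client only ever sees the balancer's address; the
            // serialization must match what the Avatica server speaks.
            String url = "jdbc:avatica:remote:url=http://lb.example.com:8765;"
                + "serialization=protobuf";
            try (Connection conn = DriverManager.getConnection(url);
                 Statement stmt = conn.createStatement();
                 ResultSet rs = stmt.executeQuery("SELECT 1")) {
                while (rs.next()) {
                    System.out.println(rs.getInt(1));
                }
            }
        }
    }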

There is some retry logic in the Avatica client to resubmit requests when a server 
responds that it doesn't have a connection/statement cached that the client thinks it 
should have (e.g. the load balancer flipped the client over to a newer server). I'm still 
a little concerned about this level of "smarts" :)
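
The shape of that pattern, as an application-level sketch (this is not 
Avatica's internal retry code, just an illustration of the idea): if the 
first attempt fails because the backend no longer recognizes the client's 
cached state, a fresh connection rebuilds that state and the query is 
retried once.

    import java.sql.Connection;
    import java.sql.DriverManager;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class RetryOnFlip {
        // Hypothetical helper: run a query, retrying once on failure, on
        // the assumption that the load balancer flipped us to a server
        // with no cached connection/statement state.
        static void runWithRetry(String url, String sql) throws SQLException {
            for (int attempt = 0; attempt < 2; attempt++) {
                try (Connection conn = DriverManager.getConnection(url);
                     Statement stmt = conn.createStatement();
                     ResultSet rs = stmt.executeQuery(sql)) {
                    while (rs.next()) {
                        System.out.println(rs.getString(1));
                    }
                    return; // success
                } catch (SQLException e) {
                    if (attempt == 1) {
                        throw e; // second failure: give up
                    }
                    // First failure: the fresh connection on the next
                    // attempt rebuilds the server-side state from scratch.
                }
            }
        }
    }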

I don't know if there is a fancier solution we could implement in Avatica. We could 
consider sharing state between Avatica servers, but I think it is 
database-dependent whether or not you could correctly reconstruct an 
iteration through a result set.
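
A sketch of why it's database-dependent: the generic way to resume an 
iteration on a different server is to re-execute the query and skip the rows 
already delivered, and that is only correct when the query has a 
deterministic total order (e.g. an ORDER BY over a unique key). The helper 
below is hypothetical:

    import java.sql.Connection;
    import java.sql.ResultSet;
    import java.sql.SQLException;
    import java.sql.Statement;

    public class ResumeSketch {
        // Re-execute an ordered query on a new server and discard the
        // prefix the client already consumed. Correct ONLY if the database
        // returns rows in exactly the same order both times; the caller
        // must close the returned ResultSet (and, through it, its Statement).
        static ResultSet resumeFrom(Connection conn, String orderedSql,
                long rowsAlreadySeen) throws SQLException {
            Statement stmt = conn.createStatement();
            ResultSet rs = stmt.executeQuery(orderedSql);
            for (long i = 0; i < rowsAlreadySeen && rs.next(); i++) {
                // skip rows the client has already seen
            }
            return rs;
        }
    }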

I talked with a dev on the Apache Hive project. He suggested that 
HiveServer2 just fails the query when the client is mid-query and the server 
dies (which is reasonable -- servers failing should be an infrequent event).


On 8/8/18 8:09 PM, JD Zheng wrote:
Hi,
Our query engine uses Calcite as its parser/optimizer and, where needed, the 
Enumerable convention as its runtime, in order to federate different storage 
engines. We are trying to enable JDBC access to our query engine. Everything 
works smoothly when we only have one Calcite/Avatica server.
However, JDBC calls fail when we run multiple instances of Calcite/Avatica servers 
behind a generic load balancer. Given that a JDBC server is not stateless, this problem 
was not a surprise. I searched around, and here are the two options suggested by the 
Phoenix developers 
(https://community.hortonworks.com/articles/9377/deploying-the-phoenix-query-server-in-production-e.html):
1. Sticky sessions: configure the load balancer to always route a given client to the same server.
2. Client-driven routing: implement Avatica's protocol so that an identifier is passed 
to the load balancer, controlling how each request is routed to the backend servers 
(a hypothetical sketch of this idea follows below).
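
To make option 2 concrete, here is a hypothetical sketch of the routing 
side. Every Avatica request carries a connection id, so a routing layer that 
hashes that id over the backend list keeps all requests for one connection 
on the same server. The backend addresses and how the id gets extracted from 
a request are assumptions for illustration, not Avatica APIs:

    import java.nio.charset.StandardCharsets;
    import java.util.List;

    public class ConnectionIdRouter {
        private final List<String> backends;

        ConnectionIdRouter(List<String> backends) {
            this.backends = backends;
        }

        // Deterministically map an Avatica connection id to one backend,
        // so every request for that connection lands on the same server.
        String route(String connectionId) {
            int hash = 0;
            for (byte b : connectionId.getBytes(StandardCharsets.UTF_8)) {
                hash = 31 * hash + b; // simple rolling hash
            }
            return backends.get(Math.floorMod(hash, backends.size()));
        }

        public static void main(String[] args) {
            ConnectionIdRouter router = new ConnectionIdRouter(
                List.of("server-a:8765", "server-b:8765"));
            // The same connection id always routes to the same backend.
            System.out.println(router.route("example-connection-id"));
        }
    }

Note that a plain modulo mapping reshuffles connections whenever the backend 
list changes; a production router would want consistent hashing or an 
explicit session table.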
Before we rush into any implementation, we would really appreciate it if anyone 
could share experience or thoughts regarding this issue. Thanks,
-Jiandan
