[ 
https://issues.apache.org/jira/browse/PHOENIX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120495#comment-15120495
 ] 

Josh Elser commented on PHOENIX-2634:
-------------------------------------

Hi [~warwithin]. This would be neat to play with some more.

I've experimented with putting multiple PQS instances behind a "dumb" load 
balancer (haproxy, specifically) with success. This has some edge cases (which 
I've talked with [~jamestaylor] about somewhere previously), notably 
automatically resuming failed queries (assuming a static dataset). These are 
the same sorts of problems you'd have to address to implement something like 
pagination/cursor.
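
To make the "resume a failed query" edge case concrete, here is a minimal 
sketch in Python. It is not how PQS or the Avatica client behaves today. The 
`fetch_page(server, offset, limit)` callable and the server names are 
hypothetical stand-ins for whatever actually issues the request, and the whole 
thing only works under the static-dataset assumption above, since the offset 
must mean the same thing on every instance.

```python
def fetch_all(servers, fetch_page, page_size=100, max_failures=3):
    """Read every row, failing over to another instance on connection errors.

    `fetch_page(server, offset, limit)` is a hypothetical callable returning a
    list of rows. Assumes a static dataset, so an offset is valid on any server.
    """
    if not servers:
        raise ValueError("at least one server is required")
    rows = []
    offset = 0      # number of rows already received; this is the resume point
    failures = 0
    index = 0       # which server we are currently talking to
    while True:
        server = servers[index % len(servers)]
        try:
            page = fetch_page(server, offset, page_size)
        except ConnectionError:
            failures += 1
            if failures > max_failures:
                raise
            index += 1  # move to the next instance but keep the offset
            continue
        rows.extend(page)
        offset += len(page)
        if len(page) < page_size:
            # A short (or empty) page means the result set is exhausted.
            return rows
```

The state a client has to carry across a failover is just the offset, which is 
the same bookkeeping a pagination/cursor feature would need.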

I've also added a [new 
attribute|http://calcite.apache.org/docs/avatica_protobuf_reference.html#rpcmetadata]
 that is returned by PQS at the wire-level for every request. This would let 
you implement your own client-routing decisions so that you could have full 
control over how a client "routes" its requests. This is just a hammer though, 
not a house.
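
As one illustration of what a client could build with that attribute, here is 
a small Python sketch of "sticky" routing: remember which instance answered 
for a connection and keep sending that connection's requests there. The 
`StickyRouter` class is hypothetical, and it assumes the metadata has already 
been decoded into a dict with a `serverAddress` key; check the linked 
reference for the real field name and encoding.

```python
import itertools


class StickyRouter:
    """Route each connection to the PQS instance that first served it."""

    def __init__(self, servers):
        if not servers:
            raise ValueError("at least one server is required")
        self._round_robin = itertools.cycle(list(servers))
        self._owner = {}  # connection id -> server address

    def choose(self, connection_id):
        # Reuse the recorded server if there is one, else take the next in rotation.
        owner = self._owner.get(connection_id)
        return owner if owner is not None else next(self._round_robin)

    def record(self, connection_id, rpc_metadata):
        # `rpc_metadata` is assumed to be a dict decoded from the response;
        # the "serverAddress" key is an assumption, not a confirmed field name.
        address = rpc_metadata.get("serverAddress")
        if address:
            self._owner[connection_id] = address

    def forget(self, connection_id):
        # Call this when the connection closes or its server is known to be dead.
        self._owner.pop(connection_id, None)
```

A client would call `choose()` before each request, `record()` with the 
metadata from each response, and `forget()` when it wants the connection to be 
rebalanced.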

When you start getting into load balancing and HA, service discovery also 
becomes an important piece (how do your clients actually find *where* your 
service is). YARN-913 introduced a "registry" which currently has a 
ZooKeeper-backed solution for service discovery. I believe there is some work 
on a DNS frontend for this, but I'm not sure of the state of it or where it's 
being tracked. There are many other systems out there which could be leveraged 
for this aspect.

So, this is a long-winded way to say: what do you think should actually be 
done? PQS is designed to scale horizontally already (as its REST-iness would 
imply), so what do you think the next step would be? Personally, I think the 
next step is to improve the edges in running behind a "dumb" load balancer and 
then look into recommendations on how DNS could be put in front of that.

Clients can then use a single name to refer to some "farm" of PQS instances, 
with the load balancer handling the routing logic. This would provide HA, 
service discovery and load balancing.

One of these days, I'll also try to write up some goodness to deploy PQS on top 
of Apache Slider to get some auto-magic scaling across a YARN instance. Not 
sure if my long-term vision would hinge on Slider or just be a deployment 
option.

> Dynamic service discovery for QueryServer
> -----------------------------------------
>
>                 Key: PHOENIX-2634
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-2634
>             Project: Phoenix
>          Issue Type: New Feature
>            Reporter: YoungWoo Kim
>
> It would be nice if Phoenix QueryServer supports a feature like HIVE-7935 for 
> HA and load balancing



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
