[ https://issues.apache.org/jira/browse/PHOENIX-2634?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15120495#comment-15120495 ]
Josh Elser commented on PHOENIX-2634:
-------------------------------------
Hi [~warwithin]. This would be neat to play with some more.
I've experimented with putting multiple PQS instances behind a "dumb" load
balancer (haproxy, specifically) with success. This has some edge cases (which
I've talked with [~jamestaylor] about somewhere previously), notably
automatically resuming failed queries (assuming a static dataset). These are
the same sorts of problems you'd have to address to implement something like
pagination/cursor.
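
To make the "resume a failed query" edge case concrete, here is a minimal sketch in Python. It is not how PQS or Avatica behaves today. It assumes a static dataset and a purely hypothetical backend interface, `backend(offset, limit)`, that returns a page of rows or raises `ConnectionError`. The names `fetch_all`, `healthy`, `flaky` and `DATA` are invented for illustration.

```python
# Hypothetical static dataset standing in for a query result.
DATA = ["r0", "r1", "r2", "r3", "r4"]


def healthy(offset, limit):
    """Simulated PQS instance that always answers."""
    return DATA[offset:offset + limit]


def flaky(offset, limit):
    """Simulated PQS instance that dies after serving the first page."""
    if offset >= 2:
        raise ConnectionError("instance went away")
    return DATA[offset:offset + limit]


def fetch_all(backends, page_size=2, max_failures=3):
    """Read every row, resuming at the current offset on the next backend
    whenever the current one fails. Only safe if the data does not change."""
    rows = []
    failures = 0
    current = 0
    while True:
        try:
            # The number of rows already read doubles as the resume offset.
            page = backends[current](len(rows), page_size)
        except ConnectionError:
            failures += 1
            if failures > max_failures:
                raise
            current = (current + 1) % len(backends)
            continue
        if not page:
            return rows
        rows.extend(page)


print(fetch_all([flaky, healthy]))
```

The client has to remember how far it got, which is the same bookkeeping a pagination or cursor feature would need.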
I've also added a [new
attribute|http://calcite.apache.org/docs/avatica_protobuf_reference.html#rpcmetadata]
that is returned by PQS at the wire-level for every request. This would let
you implement your own client-routing decisions so that you could have full
control over how a client "routes" its requests. This is just a hammer though,
not a house.
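
As a rough sketch of what such a client-side routing decision could look like, the Python below pins each connection to whichever server answered it first. It assumes the response has already been parsed into a dict and that the server address is reachable as `response["rpcMetadata"]["serverAddress"]`. That key layout, the `StickyRouter` class and the host names are assumptions for illustration, not part of any Phoenix or Avatica client.

```python
from itertools import cycle


class StickyRouter:
    """Pin each connection to the PQS instance that first answered it."""

    def __init__(self, servers):
        self._round_robin = cycle(servers)
        self._pinned = {}

    def pick(self, connection_id):
        """Return the pinned server, or the next one in round-robin order."""
        if connection_id in self._pinned:
            return self._pinned[connection_id]
        return next(self._round_robin)

    def record(self, connection_id, response):
        """Remember which server answered, based on the response metadata."""
        # Assumed layout of a parsed response; adjust to the real wire format.
        address = response.get("rpcMetadata", {}).get("serverAddress")
        if address:
            self._pinned.setdefault(connection_id, address)


router = StickyRouter(["pqs1:8765", "pqs2:8765"])
first = router.pick("conn-1")
router.record("conn-1", {"rpcMetadata": {"serverAddress": "pqs2:8765"}})
print(first, router.pick("conn-1"))
```

Pinning matters because statement and connection state lives on a single PQS instance, so follow-up requests for the same connection need to land on the same server.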
When you start getting into load balancing and HA, service discovery also
becomes an important piece (how do your clients actually find *where* your
service is). YARN-913 introduced a "registry" which currently has a
ZooKeeper-backed solution for service discovery. I believe there is some work
on a DNS frontend for this, but I'm not sure of the state of it or where it's
being tracked. There are many other systems out there which could be leveraged
for this aspect.
So, this is a long-winded way to say: what do you think should actually be
done? PQS is designed to scale horizontally already (as its REST-iness would
imply), so what do you think the next step would be? Personally, I think the
next step is to improve the edges in running behind a "dumb" load balancer and
then look into recommendations on how DNS could be put in front of that.
Clients can then use a single name to refer to some "farm" of PQS instances,
with the load balancer handling the routing logic. This would provide HA,
service discovery and load balancing.
One of these days, I'll also try to write up some goodness to deploy PQS on top
of Apache Slider to get some auto-magic scaling across a YARN instance. Not
sure if my long-term vision would hinge on Slider or just be a deployment
option.
> Dynamic service discovery for QueryServer
> -----------------------------------------
>
> Key: PHOENIX-2634
> URL: https://issues.apache.org/jira/browse/PHOENIX-2634
> Project: Phoenix
> Issue Type: New Feature
> Reporter: YoungWoo Kim
>
> It would be nice if Phoenix QueryServer supports a feature like HIVE-7935 for
> HA and load balancing