[ https://issues.apache.org/jira/browse/PHOENIX-3654?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15894985#comment-15894985 ]
Rahul Shrivastava commented on PHOENIX-3654:
--------------------------------------------

-- Do we need to pull in Curator for this? Is Curator buying enough over the standard Java ZK API to make it worth it?

Curator has a service discovery/service registration mechanism, but we can achieve the same with the plain ZK client. Will discuss with [~jamestaylor] on this. (A rough comparison of the two approaches is sketched at the end of this page.)

-- How will you know how many connections each PQS instance has from ZooKeeper? You said the znode contents were just going to be a JDBC url, right?

I think [~apurtell] suggested that we store JSON in the znode, so it can carry the JDBC URL plus load information rather than just the URL (see the registration sketch below).

-- How are multiple clients of the load balancer aware of each others' connections? Does this imply that the LB is actually a daemon running outside of the clients' JVM?

Clients will only know the load on the PQS servers via ZooKeeper, through znode watchers. They will not be aware of each other's load. The load balancer is embedded in the client, not a separate daemon.

-- Consider the following case:
-- PQS is running, maintaining its znode in ZK
-- Client begins check to PQS
-- PQS swaps-out/pauses
-- Client times out and thinks PQS node is bad
-- PQS recovers without losing ZK session
-- This would result in the client never using this PQS instance again.

Good point. The client will keep its view of PQS locations updated via ZooKeeper. There could be a mechanism to refresh the full PQS load data by re-reading all the znodes. I need to think more about this issue.

-- And, on that note, how will PQS failure be handled (or not) by this design?

If a PQS instance fails, there is no failover of its sessions. The client has to retry.

-- A concern is that there is an upper limit on the number of watchers that a ZooKeeper server can support. Considering HBase and whatever else is already setting/using watchers of its own, we should try to make sure to be "good citizens". ZOOKEEPER-1177 for context. As you've described, this would result in each client keeping a ZooKeeper connection (which, itself, is overhead) open as well as the watchers set server-side. Architecturally, this is what makes ZK a bad choice for a load balancer: once you find a PQS to talk to, you don't need to maintain the ZK connection anymore. You only need the ZK connection when the PQS instance you were talking to goes away.

I will discuss this more with [~jamestaylor] and [~apurtell]. I was not aware of the watcher/connection limit, and it would certainly limit the ability of a single ZK cluster to support a large set of PQS nodes.

> Load Balancer for thin client
> -----------------------------
>
>                 Key: PHOENIX-3654
>                 URL: https://issues.apache.org/jira/browse/PHOENIX-3654
>             Project: Phoenix
>          Issue Type: New Feature
>    Affects Versions: 4.8.0
>        Environment: Linux 3.13.0-107-generic kernel, v4.9.0-HBase-0.98
>            Reporter: Rahul Shrivastava
>             Fix For: 4.9.0
>
>         Attachments: LoadBalancerDesign.pdf
>
>   Original Estimate: 240h
>  Remaining Estimate: 240h
>
> We have been having an internal discussion on a load balancer for the PQS thin client. The general consensus is to embed the load balancer in the thin client instead of using an external load balancer such as haproxy. The idea is not to have another layer between the client and PQS; that extra layer adds operational cost, which currently leads to delays in executing projects.
> But this also comes with the challenge of building an embedded load balancer that can maintain sticky sessions and do fair load balancing based on the downstream load of the PQS servers. In addition, the load balancer needs to know the locations of the PQS servers.
> Now, the thin client needs to keep track of PQS servers via ZooKeeper (or other means).
> In the new design, it is proposed that the client (the PQS thin client) have an embedded load balancer.
> Where will the load balancer sit?
> The load balancer will be embedded within the app server client.
> How will the load balancer work?
> The load balancer will contact ZooKeeper to get the locations of PQS instances. For this, PQS needs to register itself with ZK once it comes online (see the registration sketch below). The ZooKeeper location is taken from hbase-site.xml. The load balancer will maintain a small cache of connections to PQS; when a request comes in, it will check the cache for an open connection.
> How will the load balancer know the load on PQS?
> To start with, it will pick a random open connection to PQS, which means the load balancer does not know the PQS load (see the selection sketch below). Later, we can augment the code so that the thin client receives load info from PQS and makes more intelligent decisions.
> How will the load balancer maintain sticky sessions?
> We still need to investigate how to implement sticky sessions; we can look for an open source implementation of the same.
> How will PQS register itself with the service locator?
> PQS will have the location of ZooKeeper in hbase-site.xml and will register itself with ZooKeeper. The thin client will find the PQS locations using ZooKeeper.
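
To make the registration side concrete, here is a minimal sketch of what a PQS instance could do when it comes online, using the plain ZooKeeper Java API. The parent path /phoenix/pqs and the JSON field names (url, connections) are assumptions for illustration, not existing Phoenix code. An ephemeral sequential znode means a PQS instance that loses its ZK session drops out of the pool automatically.

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.concurrent.CountDownLatch;

import org.apache.zookeeper.CreateMode;
import org.apache.zookeeper.KeeperException;
import org.apache.zookeeper.WatchedEvent;
import org.apache.zookeeper.Watcher;
import org.apache.zookeeper.ZooDefs;
import org.apache.zookeeper.ZooKeeper;

public class PqsRegistrationSketch {

    // Hypothetical parent znode under which every PQS instance registers.
    private static final String PARENT = "/phoenix/pqs";

    public static void main(String[] args) throws Exception {
        final CountDownLatch connected = new CountDownLatch(1);
        ZooKeeper zk = new ZooKeeper("zkhost:2181", 30000, new Watcher() {
            @Override
            public void process(WatchedEvent event) {
                if (event.getState() == Event.KeeperState.SyncConnected) {
                    connected.countDown();
                }
            }
        });
        connected.await();

        // Create the persistent parent path if it is not there yet.
        if (zk.exists(PARENT, false) == null) {
            try {
                zk.create(PARENT, new byte[0], ZooDefs.Ids.OPEN_ACL_UNSAFE,
                        CreateMode.PERSISTENT);
            } catch (KeeperException.NodeExistsException e) {
                // Another PQS instance created it first; nothing to do.
            }
        }

        // JSON payload: the JDBC URL plus a coarse load indicator (field names are illustrative).
        String payload =
            "{\"url\":\"jdbc:phoenix:thin:url=http://pqs-host-1:8765\",\"connections\":12}";

        // EPHEMERAL_SEQUENTIAL: the znode goes away when this PQS loses its ZK session,
        // so a crashed server disappears from the pool without extra bookkeeping.
        String path = zk.create(PARENT + "/pqs-",
                payload.getBytes(StandardCharsets.UTF_8),
                ZooDefs.Ids.OPEN_ACL_UNSAFE, CreateMode.EPHEMERAL_SEQUENTIAL);
        System.out.println("Registered as " + path);
    }
}
{code}

A real implementation would also refresh the payload periodically so the connection count stays meaningful.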
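And the matching client side: a minimal sketch of how the embedded load balancer could list the registered PQS instances and pick one at random, per the initial design above. It reads the children without setting a watch, which also speaks to the "good citizens" concern: the ZK connection only needs to be used while choosing a server. The path and JSON layout are the same assumptions as in the registration sketch; a real client would parse the JSON with a proper library.

{code:java}
import java.nio.charset.StandardCharsets;
import java.util.List;
import java.util.Random;

import org.apache.zookeeper.ZooKeeper;

public class PqsSelectorSketch {

    // Same hypothetical parent path as in the registration sketch.
    private static final String PARENT = "/phoenix/pqs";

    /** Returns the thin-client JDBC URL of a randomly chosen PQS instance. */
    public static String pickPqsUrl(ZooKeeper zk) throws Exception {
        // No watch is set: ZK is only needed while choosing a server.
        List<String> children = zk.getChildren(PARENT, false);
        if (children.isEmpty()) {
            throw new IllegalStateException("No PQS registered under " + PARENT);
        }
        // First cut: random choice. Later, the "connections" field in the payload
        // could be used to pick the least-loaded instance instead.
        String child = children.get(new Random().nextInt(children.size()));
        byte[] data = zk.getData(PARENT + "/" + child, false, null);
        String json = new String(data, StandardCharsets.UTF_8);
        // Naive extraction of the "url" field, just to keep the sketch self-contained.
        int start = json.indexOf("\"url\":\"") + "\"url\":\"".length();
        int end = json.indexOf('"', start);
        return json.substring(start, end);
    }
}
{code}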
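For the Curator question: curator-x-discovery packages the registration/lookup above into a service discovery API. Below is a rough sketch of what PQS-side registration could look like with it, assuming a hypothetical Payload class and base path; whether this is worth the extra dependency over the plain ZK code above is exactly the open question.

{code:java}
import org.apache.curator.framework.CuratorFramework;
import org.apache.curator.framework.CuratorFrameworkFactory;
import org.apache.curator.retry.ExponentialBackoffRetry;
import org.apache.curator.x.discovery.ServiceDiscovery;
import org.apache.curator.x.discovery.ServiceDiscoveryBuilder;
import org.apache.curator.x.discovery.ServiceInstance;
import org.apache.curator.x.discovery.details.JsonInstanceSerializer;

public class CuratorRegistrationSketch {

    // Hypothetical payload carried in each instance's znode (serialized to JSON by Curator).
    public static class Payload {
        public String url;
        public int connections;

        public Payload() {} // needed by the JSON serializer
        public Payload(String url, int connections) {
            this.url = url;
            this.connections = connections;
        }
    }

    public static void main(String[] args) throws Exception {
        CuratorFramework client = CuratorFrameworkFactory.newClient(
                "zkhost:2181", new ExponentialBackoffRetry(1000, 3));
        client.start();

        ServiceDiscovery<Payload> discovery = ServiceDiscoveryBuilder.builder(Payload.class)
                .client(client)
                .basePath("/phoenix/discovery") // assumed base path
                .serializer(new JsonInstanceSerializer<Payload>(Payload.class))
                .build();
        discovery.start();

        // Curator creates the ephemeral znode and re-registers it across reconnects.
        ServiceInstance<Payload> instance = ServiceInstance.<Payload>builder()
                .name("phoenix-query-server")
                .address("pqs-host-1")
                .port(8765)
                .payload(new Payload("jdbc:phoenix:thin:url=http://pqs-host-1:8765", 0))
                .build();
        discovery.registerService(instance);
    }
}
{code}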