Paul,

Out of curiosity, what language is your driver for?

On Wed, Sep 11, 2013 at 9:20 AM, Paul LeoNerd <leon...@leonerd.org.uk> wrote:

> Having got the first stage of my client connector module nicely working
> to a single node, I'm now looking at how to make it cluster-aware,
> maintaining multiple connections for reliability and load-spreading.
> What are some good strategies to take here?
>
> My current plan involves connecting to a (randomly chosen from a list?)
> seed node, to query the list of peers in the cluster, then make a
> selection of some number of those to be "primary" nodes, and some more
> as "backup" nodes. The primary nodes will be used to spread actual
> query load around, the backups sitting idle simply as a fast way to
> failover to some known-working connection if a primary falls over. By
> registering an interest in topology and status change messages, the
> client can keep the list of available nodes up-to-date.
>
> 1. What is a good way to handle prepared statements here? Should they
>    be prepared on all the (primary/all?) nodes, or just one? Some
>    applications I could imagine having just a handful of heavily-used
>    prepared statements, so they'd become a hotspot on one node if it
>    wasn't spread around. But then what to do as new nodes become
>    elected as primaries? Should they be prepared eagerly on
>    connection? Lazily at next use?
>
> 2. Secondly; what are suggested ways to actually spread load among the
>    primaries? I could imagine a simple round-robin, or something more
>    fancy involving picking the node with the fewest outstanding
>    requests, or the one on which we've been responsible for the least
>    processing time recently, or something else... Do client libraries
>    generally provide a selection of these mechanisms, or just pick one?
>
> --
> Paul "LeoNerd" Evans
>
> leon...@leonerd.org.uk
> ICQ# 4135350 | Registered Linux# 179460
> http://www.leonerd.org.uk/
>

--
:- a)

Alex Popescu
Sen. Product Manager @ DataStax
@al3xandru
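
For anyone following the thread, here is a rough sketch of two of the options raised in the quoted questions: round-robin versus fewest-outstanding-requests selection among the primaries, combined with lazy preparation of statements at next use. It is written in Python purely for illustration, since the thread does not say which language the driver uses. `Node`, `RoundRobin`, `LeastOutstanding` and `execute_prepared` are hypothetical names, not part of any existing driver, and no real network traffic is modelled.

```python
import itertools
from dataclasses import dataclass, field


@dataclass
class Node:
    """Hypothetical stand-in for one open connection to a cluster node."""
    name: str
    outstanding: int = 0                        # requests sent, not yet answered
    prepared: set = field(default_factory=set)  # statements prepared on this node


class RoundRobin:
    """Hand out the primaries in a fixed rotation."""

    def __init__(self, primaries):
        self._cycle = itertools.cycle(list(primaries))

    def pick(self):
        return next(self._cycle)


class LeastOutstanding:
    """Pick the primary with the fewest in-flight requests (first one wins ties)."""

    def __init__(self, primaries):
        self._primaries = list(primaries)

    def pick(self):
        return min(self._primaries, key=lambda n: n.outstanding)


def execute_prepared(policy, cql):
    """Choose a node via the policy, preparing the statement lazily if needed."""
    node = policy.pick()
    if cql not in node.prepared:
        # Lazy preparation: a real client would send a prepare request here
        # and remember the returned statement id. This sketch only records
        # that the node has now seen the statement.
        node.prepared.add(cql)
    node.outstanding += 1  # a real client would decrement this when the reply arrives
    return node
```

Keeping the selection policy behind a single `pick()` method means a client could offer several policies and let the application choose. A newly elected primary needs no special handling under lazy preparation, at the cost of one extra round trip the first time each statement reaches it.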