Hey Gordon,

Sounds good. Whenever you're ready, let us know. In the meantime, we're
making good progress on the basic balancer. It should be easy to change the
infrastructure to accommodate any requirements that you have.

- Doug
On Mon, Dec 27, 2010 at 6:23 AM, gordon <[email protected]> wrote:

> Hey Doug,
>
> I think the sys/RS_STATS table is a good design for this data flow -- we
> might create a convenience API to make it easier for the balancer to grab
> the data.
>
> The balancer needs an API that allows us to collect state data and
> response/feedback data from the system (operational/performance metrics)
> so we can learn the relationships there. The API also needs to present
> controls to the balancer so that it can take actions to try and move the
> system to states that are associated with good performance metrics.
>
> We'll make this more concrete soon ...
>
> On Dec 24, 7:27 am, Doug Judd <[email protected]> wrote:
>> It's a little late, sorry. So, correct me if I'm wrong: the training
>> data is just another name for the feedback that is collected. The
>> training data would come from two places, the sys/RS_STATS table and an
>> additional LoadBalancer method, receive_monitoring_data(), which is how
>> the balancer would be fed the monitoring data. I suspect that's the
>> missing API that Charles was referring to.
>>
>> One thing to keep in mind is that the reason we chose to persist the
>> per-range state in the sys/RS_STATS table is so that the system would be
>> designed for scale. When doing back-of-the-envelope calculations, I use
>> 3K range servers, 10K ranges per server, and approximately 1K worth of
>> data per range. This comes out to about 30GB, which makes it infeasible
>> to pull the data over to the Master and feed it into the LoadBalancer.
>> That's why we've introduced the sys/RS_STATS table.
>>
>> - Doug
>>
>> On Thu, Dec 23, 2010 at 10:04 PM, Doug Judd <[email protected]> wrote:
>>> I think my confusion came from the Wikipedia definitions for supervised
>>> <http://en.wikipedia.org/wiki/Supervised_learning> vs. unsupervised
>>> <http://en.wikipedia.org/wiki/Unsupervised_learning> learning. The
>>> supervised learning page is the only one that discusses the use of
>>> training data.
>>>
>>> The descriptions you give for online vs. batch sound very similar: get
>>> feedback, take action, get feedback, take action... From a system
>>> infrastructure standpoint, is there any difference? Does the
>>> traditional connotation of the word "batch" come into play at all? For
>>> example, does a batch system accumulate a large "batch" of feedback
>>> that is processed at once, whereas an online system processes feedback
>>> more continuously?
>>>
>>> On Thu, Dec 23, 2010 at 7:52 PM, Gordon <[email protected]> wrote:
>>>> These are great comments -- Doug, on the learning problem, I think you
>>>> are referring to online learning versus batch learning. In an online
>>>> learning setting we would get system state information, take actions,
>>>> and then get feedback from the system to learn better policies or
>>>> better estimates for the value of actions for any given state.
>>>>
>>>> In a batch learning setting we could accumulate training data in the
>>>> form of system state and system performance as feedback -- then the
>>>> balancer takes actions to try and move the system to states that have
>>>> experienced good system performance in the past.
>>>>
>>>> In each of the cases we have a notion of collecting feedback (i.e.
>>>> system performance) associated with system state or directly from our
>>>> actions as we go about the various learning tasks, so it's supervised
>>>> learning in both cases.
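To pin down the online-versus-batch distinction in infrastructure terms,
here is a minimal C++ sketch of the two feedback loops, using
receive_monitoring_data() as the feed method mentioned above. All other
names (MonitoringSnapshot, BalancePlan, OnlineBalancer, BatchBalancer) are
hypothetical illustrations, not the actual Hypertable interface:

    #include <deque>
    #include <vector>

    // Hypothetical per-interval system state plus observed performance.
    struct MonitoringSnapshot {
      std::vector<double> server_loads;  // one entry per range server
      double load_average;               // feedback / performance metric
    };

    struct BalancePlan { /* range moves to apply */ };

    // Online: learn from each snapshot and (possibly) act immediately.
    class OnlineBalancer {
     public:
      BalancePlan receive_monitoring_data(const MonitoringSnapshot &snap) {
        update_model(snap);       // incremental update from one observation
        return plan_moves(snap);  // act on the latest state
      }
     private:
      void update_model(const MonitoringSnapshot &) { /* e.g. SGD step */ }
      BalancePlan plan_moves(const MonitoringSnapshot &) { return {}; }
    };

    // Batch: accumulate snapshots, then learn and act once per window.
    class BatchBalancer {
     public:
      void receive_monitoring_data(const MonitoringSnapshot &snap) {
        history_.push_back(snap);  // just accumulate; no action yet
      }
      BalancePlan balance() {      // invoked once per window (e.g. daily)
        retrain_model(history_);   // learn from the whole batch
        return plan_from_model();
      }
     private:
      std::deque<MonitoringSnapshot> history_;
      void retrain_model(const std::deque<MonitoringSnapshot> &) {}
      BalancePlan plan_from_model() { return {}; }
    };

In both variants the feedback (system performance) is collected alongside
system state; the difference is only whether the model is updated per
snapshot or per accumulated window.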
>>>>
>>>> On Fri, Dec 24, 2010 at 1:46 AM, Doug Judd <[email protected]> wrote:
>>>>> Hi Charles,
>>>>>
>>>>> Thanks for the feedback. Comments inline ...
>>>>>
>>>>> On Thu, Dec 23, 2010 at 8:12 AM, Charles <[email protected]> wrote:
>>>>>> [...]
>>>>>>
>>>>>> 1. Although the design document talks about passing training data to
>>>>>> the LoadBalancer object, glancing briefly at the pseudocode for the
>>>>>> class definition, it's not clear to me what the API for passing the
>>>>>> training data is, or what the format would be.
>>>>>
>>>>> We were thinking more along the lines of unsupervised learning. We
>>>>> certainly could explore a supervised learning approach. Any ideas for
>>>>> what the training data should look like? Also, how could we go about
>>>>> generating it?
>>>>>
>>>>>> 2. I'm assuming that the LoadBalancer has full access to the items
>>>>>> in the data table itself during the balance operation, in case it
>>>>>> wants to collect data about them for use in the prediction. Not
>>>>>> clear if this would be useful b/c of the cost involved in gathering
>>>>>> this data, but interesting to explore.
>>>>>
>>>>> When you say "data table" I assume you're referring to the
>>>>> sys/RS_STATS table. The LoadBalancer will have full access to the
>>>>> items in this table. In fact, this table exists solely to feed the
>>>>> LoadBalancer performance statistics. This table is populated by the
>>>>> RangeServers directly to minimize the impact of statistics gathering
>>>>> on the system. This means that the load balancer and the RangeServers
>>>>> will need to coordinate on what information gets collected, how
>>>>> often, and how much historical information will be kept around.
>>>>>
>>>>>> 3. I would expect that practical implementations of LoadBalancer
>>>>>> would want a way to serialise their nontrivial state (presumably
>>>>>> using HT itself), but I'm not sure if there's any special API
>>>>>> support required for that. (Maybe a reserved table for LB data?)
>>>>>
>>>>> The LoadBalancer runs in the Master process (at least for now). The
>>>>> Master has a meta log (MML) that is written to the underlying DFS.
>>>>> Currently the plan is to have the basic balancer serialize state
>>>>> about in-progress balance operations in the MML. We can use the MML
>>>>> to persist other LoadBalancer state as well. However, the MML is
>>>>> designed to hold a very small amount of data. If you think there
>>>>> might be a need for persisting a very large amount of state, we can
>>>>> consider another system table (sys/balancer).
>>>>>
>>>>>> 4. It may be worth providing a convenience implementation of
>>>>>> LoadBalancer that works in the batch setting like the basic
>>>>>> algorithm, i.e., a superclass for load balancers that want to
>>>>>> operate once a day based on data that has been collected in the
>>>>>> last 24 hours.
>>>>>
>>>>> Sounds good.
>>>>>
>>>>>> 5. A LoadBalancer might want to use different strategies for the
>>>>>> cases of adding a new range server versus high variance among
>>>>>> servers. Is there a way for the master to signal which of these
>>>>>> situations is the case?
>>>>>
>>>>> The monitoring data that gets fed into the LoadBalancer on a regular
>>>>> interval (e.g. 30 seconds) will contain range server and range count
>>>>> information for all of the range servers. If a new range server
>>>>> suddenly appears with a range count of zero, that would imply that a
>>>>> new server was added.
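The answer to point 5 implies the balancer can infer "new server" from the
monitoring data itself rather than needing an explicit signal from the
Master. A small C++ sketch of that inference (types and names here are
hypothetical, not the actual monitoring-data format):

    #include <cstddef>
    #include <string>
    #include <unordered_map>
    #include <vector>

    // Hypothetical per-server entry in the periodic monitoring data.
    struct ServerStats {
      std::string server_id;
      std::size_t range_count;
    };

    // A newly added server appears in the monitoring data for the first
    // time with a range count of zero, so the balancer can distinguish
    // "new server" from "high variance among existing servers".
    std::vector<std::string>
    detect_new_servers(const std::vector<ServerStats> &stats,
                       std::unordered_map<std::string, std::size_t> &known) {
      std::vector<std::string> fresh;
      for (const auto &s : stats) {
        if (known.find(s.server_id) == known.end() && s.range_count == 0)
          fresh.push_back(s.server_id);
        known[s.server_id] = s.range_count;  // remember for next interval
      }
      return fresh;
    }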
>>>>>> 6. Of course an effective challenge problem would also require a
>>>>>> test workload that is challenging enough to be representative of
>>>>>> real usages of the load balancer. As close to real usage as possible
>>>>>> would be best, to try to forestall the danger of designing ML
>>>>>> algorithms that are strong enough to learn the features of the
>>>>>> synthetic problem generator but not those of real data.
>>>>>
>>>>> We'll work on pulling some real-world workload together. The realtime
>>>>> Twitter stream sample <http://dev.twitter.com/pages/streaming_api>
>>>>> might be a good place to start.
>>>>>
>>>>>> 7. It is unclear what the optimal granularity for aggregating the
>>>>>> range counts would be (could be less than 30 sec, or more). You
>>>>>> might want to make this a settable parameter of the master. Note
>>>>>> that this is orthogonal to how often the master decides to send data
>>>>>> to the load balancer; e.g., the master could send data every thirty
>>>>>> seconds that consists of 6 bins of counts recorded every 5 sec.
>>>>>
>>>>> The LoadBalancer API has a method for publishing the stats gathering
>>>>> interval, so balancers would have a way to change it. There's not
>>>>> much cost associated with sending data to the LoadBalancer, so I
>>>>> don't think the vector approach would be necessary. We empirically
>>>>> chose 30 seconds as the default because there is some overhead
>>>>> involved in gathering statistics, including network communication
>>>>> with all of the range servers and mutex locking/unlocking for each
>>>>> range managed by each server.
>>>>>
>>>>>> 8. Wrt the objective functions, different performance metrics are of
>>>>>> interest to the user, and the user might want to have knobs to say
>>>>>> (e.g.) exactly what SLA they would like satisfied. But it's not
>>>>>> clear to me whether this is part of load balancing (i.e., deciding
>>>>>> which ranges are served by which server) or auto-scaling (deciding
>>>>>> how many servers to have). It may be too early to lock down an API
>>>>>> on this without having more experience with practical
>>>>>> SML/Optimization load balancers.
>>>>>
>>>>> We were thinking that "load average" would be a good overall
>>>>> objective performance metric. But now that you mention it, I suppose
>>>>> optimizing for query latency or overall throughput might yield a
>>>>> different balance. Adding this sort of user input is trivial to do.
>>>>>
>>>>> On a related note, there are a number of other places in the system
>>>>> that could benefit from machine learning. Query cache size is one
>>>>> that comes to mind. Currently the query cache size is statically
>>>>> configured, with a default size of 50MB. It would be great to learn
>>>>> the optimal size for this cache to improve query latency.
>>>>>
>>>>> - Doug
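The answers to points 7 and 8 suggest two knobs a balancer could expose: a
published stats-gathering interval and a user-selectable objective metric.
A rough C++ sketch follows; the method name and the objective proxies are
assumptions for illustration, not the actual Hypertable API:

    #include <algorithm>
    #include <cstdint>
    #include <vector>

    // Hypothetical surface for the interval knob from point 7.
    class LoadBalancer {
     public:
      // Publish the stats-gathering interval this balancer wants the
      // Master to use when polling RangeServers (default 30 seconds).
      virtual std::uint32_t monitoring_interval_ms() const { return 30000; }
      virtual ~LoadBalancer() = default;
    };

    // A user-selectable objective metric, per point 8.
    enum class Objective { LoadAverage, QueryLatency, Throughput };

    // Score a candidate assignment by its per-server loads (lower is
    // better). Assumes `loads` is non-empty; the proxies below are
    // illustrative assumptions, not tuned formulas.
    double score(const std::vector<double> &loads, Objective obj) {
      double hi = *std::max_element(loads.begin(), loads.end());
      double lo = *std::min_element(loads.begin(), loads.end());
      switch (obj) {
        case Objective::LoadAverage:
          return hi - lo;  // even out load across servers
        case Objective::QueryLatency:
        case Objective::Throughput:
          // The hottest server dominates tail latency and caps aggregate
          // throughput, so minimize the hottest server's load.
          return hi;
      }
      return 0.0;  // unreachable; keeps compilers quiet
    }

Different objectives can prefer different balances: evening out load
averages may tolerate a hot server if the spread is small, while a
latency-oriented objective would move ranges off that server first.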
>>>>
>>>> --
>>>> Gordon Rios -- Cork Constraint Computation Centre
>>>> http://www.4c.ucc.ie/web/people.jsp?id=144
>>>> http://www.linkedin.com/in/gordonrios
>>>> Ireland: +353 86 089 2416
>>>> USA: +1 650 906 3473
