I don't know of any papers offhand, but I would think you could go down two routes: a predictive trend algorithm that 'guesses' which blocks could get hot based on seasonal traffic, and a reactive one based on response time, regularized by the number of replicas the block is on.
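
A rough sketch of both routes, under assumptions that are not from the paper: the predictive side is a plain seasonal average (same phase in earlier periods), and "regularized by replicas" is read as mean latency divided by replica count. All names and thresholds here (`BlockStats`, `seasonal_forecast`, `plan_replicas`, `add_above`, `drop_below`) are made up for illustration.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class BlockStats:
    block_id: str
    mean_latency_ms: float  # observed mean response time for reads of this block
    replicas: int           # number of copies currently serving the block


def seasonal_forecast(history: List[float], period: int) -> float:
    """Predictive route: guess the next value as the mean of the values
    seen at the same phase in earlier periods (e.g. same hour on prior days)."""
    nxt = len(history)
    same_phase = [history[i] for i in range(nxt - period, -1, -period)]
    if not same_phase:
        raise ValueError("need at least one full period of history")
    return sum(same_phase) / len(same_phase)


def plan_replicas(stats: BlockStats,
                  add_above: float = 50.0,
                  drop_below: float = 10.0,
                  min_replicas: int = 3,
                  max_replicas: int = 10) -> int:
    """Reactive route: score = latency / replicas (an assumed normalization).
    Add a copy when the score is high, remove one when it is low."""
    score = stats.mean_latency_ms / stats.replicas
    if score > add_above and stats.replicas < max_replicas:
        return stats.replicas + 1
    if score < drop_below and stats.replicas > min_replicas:
        return stats.replicas - 1  # the recovery step: shed extra copies
    return stats.replicas
```

The gap between `add_above` and `drop_below` is there so a block near the boundary does not gain and lose a copy on every pass; the actual values would have to come from measurement.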
Sent from my iPhone

On 12/09/2012, at 2:21 PM, Worthy LaFollette <[email protected]> wrote:

> As Ian explained down thread, the paper gave two examples. The first was
> static seeding of duplicates, the second was dynamic with a suggestion of a
> monitor which seeds additional copies based on some algorithm in response
> to "hot" queries (China being the topic of the example given). I am
> curious if anyone was aware of any papers about this second part. I can
> almost see a cost model where the query measures the overall cost of a
> query (latency, risk of latency?) and then generates copies in response.
> Part of this of course would be a recovery mechanism which removes these
> extra copies.
>
> W-
>
> On Tue, Sep 11, 2012 at 9:31 PM, Ted Dunning <[email protected]> wrote:
>
>> What do you mean by selective replication?
>>
>> On Tue, Sep 11, 2012 at 7:23 PM, Worthy LaFollette <[email protected]> wrote:
>>
>>> Very good paper. Am curious now to the strategies for selective
>>> replication, which looks if done right would make the query generation
>>> more efficient. Do you know of any papers on that subject?
>>>
>>> On Tue, Sep 11, 2012 at 1:37 PM, Ted Dunning <[email protected]> wrote:
>>>
>>>> Headed into Thursday's meetup, this paper by Jeff Dean provides a very
>>>> good description of strategies for getting fast response times with
>>>> variable quality infrastructure.
>>>>
>>>> http://research.google.com/people/jeff/latency.html
>>>>
>>>> The key point here is that it is very important to have asynchronous
>>>> queries with a cancel. Above that level, there needs to be a simple
>>>> strategy for pushing second versions of queries out to the workers and
>>>> canceling defunct or redundant queries.
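
On Ted's point at the bottom of the thread about asynchronous queries with a cancel, a minimal sketch might look like the following. It assumes each replica is an async callable and that a second copy of the query goes out after a fixed delay; `hedged_query` and `hedge_after` are hypothetical names, and nothing here is taken from the paper itself.

```python
import asyncio
from typing import Awaitable, Callable, List

Replica = Callable[[str], Awaitable[str]]


async def hedged_query(replicas: List[Replica], query: str,
                       hedge_after: float = 0.05) -> str:
    """Send the query to the first replica; if it has not answered within
    hedge_after seconds, send a second copy to the next replica. Return
    whichever answer arrives first and cancel the redundant request."""
    tasks = [asyncio.create_task(replicas[0](query))]
    done, pending = await asyncio.wait(tasks, timeout=hedge_after)
    if not done:
        if len(replicas) > 1:
            # Primary is slow: push a second version of the query out.
            tasks.append(asyncio.create_task(replicas[1](query)))
        done, pending = await asyncio.wait(
            tasks, return_when=asyncio.FIRST_COMPLETED)
    for task in pending:
        task.cancel()  # cancel the now-redundant query
    # Let the cancellations finish before returning.
    await asyncio.gather(*pending, return_exceptions=True)
    return next(iter(done)).result()
```

A quick way to try it is with two fake replicas that differ only in their sleep time; the slow one should be cancelled as soon as the fast one answers.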
