I don't know of any papers offhand, but I would think you could go down two
routes: a predictive trend algorithm that 'guesses' which blocks could get hot
based on seasonal traffic, and a reactive one based on each block's response
time normalized by the number of replicas it is on.
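
A rough sketch of what the reactive route might look like, in Python. All of
it is assumption for illustration: the BlockStats record, the hotness and
hot_blocks functions, the choice of dividing mean response time by replica
count, and the threshold are hypothetical and not taken from any paper or
system.

```python
from __future__ import annotations

from dataclasses import dataclass
from statistics import mean


@dataclass
class BlockStats:
    """Hypothetical per-block record: recent response times and replica count."""
    block_id: str
    response_times_ms: list[float]
    replicas: int


def hotness(stats: BlockStats) -> float:
    """Mean response time divided by the number of replicas holding the block.

    This is only one possible reading of "response time adjusted for replica
    count"; a real monitor might weight by request rate as well.
    """
    if stats.replicas <= 0:
        raise ValueError("a block needs at least one replica")
    if not stats.response_times_ms:
        return 0.0  # no recent traffic, so nothing to react to
    return mean(stats.response_times_ms) / stats.replicas


def hot_blocks(all_stats: list[BlockStats], threshold_ms: float) -> list[str]:
    """Return ids of blocks whose score exceeds the threshold, hottest first."""
    scored = [(hotness(s), s.block_id) for s in all_stats]
    return [bid for score, bid in sorted(scored, reverse=True) if score > threshold_ms]
```

Blocks returned by hot_blocks would be candidates for extra copies; the same
score falling back under the threshold could drive removing those copies again.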

On 12/09/2012, at 2:21 PM, Worthy LaFollette <[email protected]> wrote:

> As Ian explained down thread, the paper gave two examples.  The first was
> static seeding of duplicates, the second was dynamic with a suggestion of a
> monitor which seeds additional copies based on some algorithm in response
> to "hot" queries (China being the topic of the example given).  I am
> curious whether anyone is aware of any papers about this second part.  I can
> almost see a cost model that measures the overall cost of a query
> (latency, risk of latency?) and then generates copies in response.
> Part of this of course would be a recovery mechanism which removes these
> extra copies.
> 
> W-
> 
> On Tue, Sep 11, 2012 at 9:31 PM, Ted Dunning <[email protected]> wrote:
> 
>> What do you mean by selective replication?
>> 
>> On Tue, Sep 11, 2012 at 7:23 PM, Worthy LaFollette <[email protected]> wrote:
>> 
>>> Very good paper. I am curious now about strategies for selective
>>> replication, which, if done right, looks like it would make query
>>> generation more efficient.  Do you know of any papers on that subject?
>>> 
>>> On Tue, Sep 11, 2012 at 1:37 PM, Ted Dunning <[email protected]>
>>> wrote:
>>> 
>>>> Headed into Thursday's meetup, this paper by Jeff Dean provides a very
>>>> good description of strategies for getting fast response times with
>>>> variable-quality infrastructure.
>>>> 
>>>> http://research.google.com/people/jeff/latency.html
>>>> 
>>>> The key point here is that it is very important to have asynchronous
>>>> queries with a cancel.  Above that level, there needs to be a simple
>>>> strategy for pushing second versions of queries out to the workers and
>>>> canceling defunct or redundant queries.
>> 
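
For the point at the bottom of the thread about asynchronous queries with a
cancel, here is a minimal Python asyncio sketch. It is an illustration under
stated assumptions, not the method from the paper: hedged_query, the idea of a
replica as an async callable, and the hedge_after_s delay are all hypothetical
names chosen for this example.

```python
import asyncio


async def hedged_query(replicas, query, hedge_after_s):
    """Ask the first replica; each time `hedge_after_s` passes with no answer,
    send a second version of the query to the next replica. Return the first
    result and cancel every query that is still outstanding.

    `replicas` is a list of async callables taking the query (an assumption
    made for this sketch). An exception from the first finished query is
    re-raised to the caller.
    """
    if not replicas:
        raise ValueError("need at least one replica")
    tasks = []
    try:
        for replica in replicas:
            tasks.append(asyncio.create_task(replica(query)))
            done, _pending = await asyncio.wait(
                tasks, timeout=hedge_after_s, return_when=asyncio.FIRST_COMPLETED
            )
            if done:
                return done.pop().result()
        # Every replica has been asked; wait for whichever answers first.
        done, _pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
        return done.pop().result()
    finally:
        for task in tasks:
            task.cancel()  # no effect on queries that already finished
        # Let the cancellations run to completion before returning.
        await asyncio.gather(*tasks, return_exceptions=True)
```

The cancel in the finally block is what keeps the redundant queries from
continuing to consume worker time once one answer is in.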
