Take a look at NetworkTopologyStrategy and/or the RackInferringSnitch; together they decide where replicas are placed. It's probably not a great idea to muck around with this stuff, though.
How about a Hadoop job to pull out the data you want? It would be a full scan, but in parallel.

Aaron

On 22/02/2011, at 3:10 AM, Héctor Izquierdo Seliva <[email protected]> wrote:

> Hi all.
>
> Is there a way (besides changing the code) to replicate data from
> data center 1 to data center 2, but not the other way around? I need to
> have a preproduction environment with production data, and ideally with
> only a fraction of the data (for example, by key prefixes). I have
> poked around StorageProxy and I can make writes in DC2 not replicate to
> DC1, and as long as I use DC_QUORUM it stays that way, but it
> looks...dangerous. I could do a full key scan but it would take too
> long.
>
> Has anybody done something similar?
>
> Thanks!
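To make the "full scan, but in parallel" idea concrete, here is a rough sketch of the shape such a job might take. It is not Hadoop or Cassandra code: the in-memory list of (key, value) rows stands in for the column family, the thread pool stands in for the mappers, and the names `split_rows`, `scan_split` and `extract_by_prefix` are made up for illustration. A real job would read its splits through Cassandra's Hadoop integration and write the matching rows to the preproduction cluster.

```python
from concurrent.futures import ThreadPoolExecutor


def split_rows(rows, n_splits):
    """Partition rows into n_splits chunks (a stand-in for Hadoop input splits)."""
    return [rows[i::n_splits] for i in range(n_splits)]


def scan_split(split, prefixes):
    """Map step: keep only the rows whose key starts with a wanted prefix."""
    return [(key, value) for key, value in split if key.startswith(prefixes)]


def extract_by_prefix(rows, prefixes, n_splits=4):
    """Scan every split in parallel and collect the rows matching any prefix."""
    prefixes = tuple(prefixes)  # str.startswith accepts a tuple of candidates
    splits = split_rows(rows, n_splits)
    with ThreadPoolExecutor(max_workers=n_splits) as pool:
        parts = pool.map(scan_split, splits, [prefixes] * len(splits))
        # Sort so the result does not depend on how rows were split.
        return sorted(kv for part in parts for kv in part)


if __name__ == "__main__":
    # Hypothetical sample data; real keys would come from the cluster.
    rows = [("user:1", "a"), ("order:7", "b"), ("user:2", "c"), ("cart:3", "d")]
    print(extract_by_prefix(rows, ["user:"]))
```

Every row is still read once, so the total work is a full scan; the gain is only that the splits are processed side by side, and that nothing about replica placement has to change.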
