> On April 2, 2014, 3:15 p.m., Josh Elser wrote:
> > docs/src/main/resources/design/ACCUMULO-378-design.mdtext, line 62
> > <https://reviews.apache.org/r/19862/diff/1/?file=543190#file543190line62>
> >
> >     Thinking about this from a total ordering standpoint. Say we're 
> > replicating to two slaves, and we have three rfiles to replicate (1, 2 and 
> > 3) to those two slaves.
> >     
> >     We replicate rfile1 to both, but then the link to slave2 goes down. We 
> > can still replicate rfile2 and then rfile3 to slave1, while we try to send 
> > rfile2 to slave2.
> >     
> >     What if, instead of the link being down, we happen to communicate with an 
> > angry server inside of slave2 which never completes the transfer? We don't 
> > want to transfer rfile3, in an attempt to better preserve global ordering.
> >     
> >     This can be restated as "we only want to replicate one 'file' to a 
> > slave at a time" so that we preserve the original semantics of the 
> > replication "queue" (table). The problem is that this could drastically 
> > slow down replication when the link between master and slave cannot be 
> > saturated by one replication task at a time.
> >     
> >     This isn't anything that we can reliably guarantee now (without 
> > conditional mutations), right? Is it worth trying to tackle? The one clear 
> > change I want to make is that we do want to put the identifier for the 
> > slave in with the replication record rather than defer determination of 
> > where a record should be replicated.

I also think transferring files should be an external concern, as Mike said.  
One way this could work is the following.

 1. Cluster A exports a batch of file URIs and a control file (similar to 
export table).
 2. The user distcps the URIs and the control file.
 3. The control file and the dir containing the distcp'd files are provided to 
Cluster B to import.

The difference between this and import/export table is that it's stateful.  
Export on Cluster A provides the list of changes since the last export.  The 
control file contains ordering information about how to apply the files.  The 
control file also contains ordering information about other import/exports.  
This process is incomplete, though: the feedback process would still need to 
be worked out.  The entire process should be resilient to users trying to 
apply things out of order.
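
As a rough illustration of the stateful export/import idea above, here is a 
minimal Python sketch. Everything in it is an assumption made for 
illustration: ControlFile, Exporter, Importer, and the export_id / 
previous_export_id fields are hypothetical names, not anything that exists in 
Accumulo. It only shows how a control file that names its predecessor could 
let Cluster B reject batches that are applied out of order; it does not cover 
the feedback process.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass(frozen=True)
class ControlFile:
    """Hypothetical control file produced by one export on Cluster A."""
    export_id: int                     # position of this export in the sequence
    previous_export_id: Optional[int]  # export that must be applied first, if any
    file_uris: List[str]               # files, in the order they must be applied


class OutOfOrderImport(Exception):
    """Raised when a batch is imported before its predecessor (or twice)."""


@dataclass
class Exporter:
    """Cluster A side: remembers what has already been exported."""
    last_export_id: Optional[int] = None
    pending: List[str] = field(default_factory=list)

    def record_change(self, uri: str) -> None:
        self.pending.append(uri)

    def export(self) -> ControlFile:
        # Each export covers only the changes since the previous export.
        next_id = 0 if self.last_export_id is None else self.last_export_id + 1
        control = ControlFile(next_id, self.last_export_id, list(self.pending))
        self.last_export_id = next_id
        self.pending.clear()
        return control


@dataclass
class Importer:
    """Cluster B side: applies batches strictly in export order."""
    last_applied_id: Optional[int] = None
    applied: List[str] = field(default_factory=list)

    def import_batch(self, control: ControlFile) -> None:
        # Refuse any batch whose predecessor is not the last one applied.
        if control.previous_export_id != self.last_applied_id:
            raise OutOfOrderImport(
                f"export {control.export_id} expects {control.previous_export_id}, "
                f"but last applied was {self.last_applied_id}"
            )
        self.applied.extend(control.file_uris)
        self.last_applied_id = control.export_id
```

With this shape, importing the second export before the first raises 
OutOfOrderImport and leaves Cluster B unchanged, as does importing the same 
export twice. The distcp step stays entirely outside the sketch, which matches 
the idea of treating file transfer as an external concern.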


- kturner


-----------------------------------------------------------
This is an automatically generated e-mail. To reply, visit:
https://reviews.apache.org/r/19862/#review39260
-----------------------------------------------------------


On April 1, 2014, 1:58 a.m., Josh Elser wrote:
> 
> -----------------------------------------------------------
> This is an automatically generated e-mail. To reply, visit:
> https://reviews.apache.org/r/19862/
> -----------------------------------------------------------
> 
> (Updated April 1, 2014, 1:58 a.m.)
> 
> 
> Review request for accumulo.
> 
> 
> Bugs: ACCUMULO-378
>     https://issues.apache.org/jira/browse/ACCUMULO-378
> 
> 
> Repository: accumulo
> 
> 
> Description
> -------
> 
> Re-posting a version of the design doc that I own. Contains grammatical fixes 
> from round one, with a few extra clarifications. New content should be posted 
> here, but I'll maintain the old review as discussion progresses.
> 
> 
> Diffs
> -----
> 
>   docs/src/main/resources/design/ACCUMULO-378-design.mdtext PRE-CREATION 
> 
> Diff: https://reviews.apache.org/r/19862/diff/
> 
> 
> Testing
> -------
> 
> 
> Thanks,
> 
> Josh Elser
> 
>
