Josh Elser created ACCUMULO-3232:
------------------------------------

             Summary: Improve consumption of WAL header in partial replication case
                 Key: ACCUMULO-3232
                 URL: https://issues.apache.org/jira/browse/ACCUMULO-3232
             Project: Accumulo
          Issue Type: Improvement
          Components: replication
            Reporter: Josh Elser
            Assignee: Josh Elser
             Fix For: 1.7.0


Consider a system that is actively replicating from one instance to another. 
Specifically, assume there is one WAL that is currently being replicated to the 
destination and the source instance is shut down.

When the source instance is restarted, it will notice that the WAL has already 
been read through N {{LogFileKey}}/{{LogFileValue}} pairs (from before it was 
shut down) and will proceed past these records to get to the data in the file 
which it still needs to read.

We have to re-read each of these pairs from the file because the WAL is an 
append-only structure, and we can't efficiently seek to some point in the file, 
as we wouldn't know how to correlate the byte offset to entries.
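
To illustrate the cost of re-reading, here is a minimal sketch. It assumes a hypothetical framing in which each record is a 4-byte big-endian length followed by its payload; this is not the actual Accumulo WAL format, and {{write_record}}, {{build_log}} and {{skip_records}} are illustrative names only. Knowing just the record count N, the reader has to walk every record from the start of the file to find where record N+1 begins.

```python
import io
import struct


def write_record(stream, payload: bytes) -> None:
    """Append one record: 4-byte big-endian length, then the payload (assumed framing)."""
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def build_log(payloads) -> io.BytesIO:
    """Build an in-memory stand-in for an append-only log file."""
    log = io.BytesIO()
    for payload in payloads:
        write_record(log, payload)
    return log


def skip_records(stream, n: int) -> int:
    """Re-read n records from the start of the log and return the byte offset after them.

    Only the record count is known, so every record must be read again:
    the cost grows with n, not with the amount of data still to replicate.
    """
    stream.seek(0)
    for _ in range(n):
        header = stream.read(4)
        if len(header) < 4:
            raise EOFError("log holds fewer records than expected")
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            raise EOFError("truncated record")
    return stream.tell()
```

After {{skip_records(log, n)}} returns, the stream is positioned at the first record that has not yet been replicated.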

As we read the WAL, in addition to (or perhaps instead of) tracking the offset 
in the WAL, it would be good to track the correlation of N bytes read to M 
records consumed, which would help us better resume replication.
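
One possible shape for this, as a rough sketch only: persist a (bytes read, records consumed) pair and seek straight to the byte offset on restart. The example again assumes a hypothetical length-prefixed framing rather than the real WAL format, and {{Checkpoint}}, {{encode_records}} and {{read_records}} are made-up names, not existing Accumulo classes or methods.

```python
import io
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """Correlates N bytes read with M records consumed (hypothetical structure)."""
    bytes_read: int
    records_consumed: int


def encode_records(payloads) -> bytes:
    """Encode payloads with the assumed framing: 4-byte big-endian length, then payload."""
    return b"".join(struct.pack(">I", len(p)) + p for p in payloads)


def read_records(stream, start: Checkpoint, limit: int):
    """Seek to the checkpoint, read up to `limit` complete records, return (payloads, new checkpoint)."""
    stream.seek(start.bytes_read)
    payloads = []
    for _ in range(limit):
        header = stream.read(4)
        if len(header) < 4:
            break  # end of log, or a partially written header
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            break  # partially written record: do not count it
        payloads.append(payload)
    # Advance the checkpoint by complete records only.
    consumed = sum(4 + len(p) for p in payloads)
    new_checkpoint = Checkpoint(
        bytes_read=start.bytes_read + consumed,
        records_consumed=start.records_consumed + len(payloads),
    )
    return payloads, new_checkpoint
```

A reader that stored the returned checkpoint before shutdown could pass it back in after restart and resume at the next unread record without touching the earlier ones. The checkpoint only advances past complete records, so a record that was partially written at shutdown is read again in full later.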



--
This message was sent by Atlassian JIRA
(v6.3.4#6332)
