Josh Elser created ACCUMULO-3232:
------------------------------------
Summary: Improve consumption of WAL header in partial replication case
Key: ACCUMULO-3232
URL: https://issues.apache.org/jira/browse/ACCUMULO-3232
Project: Accumulo
Issue Type: Improvement
Components: replication
Reporter: Josh Elser
Assignee: Josh Elser
Fix For: 1.7.0
Consider a system that is actively replicating from one instance to another.
Specifically, assume there is one WAL that is currently being replicated to the
destination and the source instance is shut down.
When the source instance is restarted, it will notice that N
{{LogFileKey}}/{{LogFileValue}} pairs of the WAL have already been read (from
before it was shut down) and will proceed past these records to get to the data
in the file which it still needs to read.
We have to re-read each of these pairs from the file because the WAL is an
append-only structure, and we can't efficiently seek to some point in the file,
as we wouldn't know how to correlate the byte offset to entries.
As we read the WAL, in addition to (or perhaps instead of) tracking the offset
in the WAL, it would be good to track the correlation of N bytes read to M
records consumed, which would help us better resume replication.
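A rough sketch of the idea, in Python for brevity rather than Accumulo's Java.
Everything here is hypothetical: the length-prefixed record layout stands in
for the real WAL format, and Checkpoint, write_record and read_records are
illustrative names, not existing Accumulo classes. The point is only that
persisting (bytes read, records consumed) together lets a restarted reader
seek directly instead of re-reading the first M records.

```python
import io
import struct
from dataclasses import dataclass


@dataclass(frozen=True)
class Checkpoint:
    """Hypothetical pairing of N bytes read with M records consumed."""
    bytes_read: int
    records_consumed: int


def write_record(stream, payload: bytes) -> None:
    # Assumed layout: 4-byte big-endian length, then the payload.
    stream.write(struct.pack(">I", len(payload)))
    stream.write(payload)


def read_records(stream, start=Checkpoint(0, 0), limit=None):
    """Read records beginning at a checkpoint; return (records, new checkpoint)."""
    # Seek straight to the saved byte offset instead of skipping M records.
    stream.seek(start.bytes_read)
    records = []
    checkpoint = start
    while limit is None or len(records) < limit:
        header = stream.read(4)
        if len(header) < 4:
            break  # end of file (or a partially written length prefix)
        (length,) = struct.unpack(">I", header)
        payload = stream.read(length)
        if len(payload) < length:
            break  # partially written record: do not advance the checkpoint
        records.append(payload)
        # Advance both counters together so they always stay correlated.
        checkpoint = Checkpoint(
            checkpoint.bytes_read + 4 + length,
            checkpoint.records_consumed + 1,
        )
    return records, checkpoint
```

For example, after reading two records from a three-record file, saving the
returned checkpoint and passing it back as start on the next call yields only
the third record, without touching the first two again.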
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)