[ https://issues.apache.org/jira/browse/ACCUMULO-2846?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]
Josh Elser resolved ACCUMULO-2846.
----------------------------------
Resolution: Fixed
> Need to re-use DataInputStream for reading files that need replication
> ----------------------------------------------------------------------
>
> Key: ACCUMULO-2846
> URL: https://issues.apache.org/jira/browse/ACCUMULO-2846
> Project: Accumulo
> Issue Type: Sub-task
> Components: replication
> Reporter: Josh Elser
> Assignee: Josh Elser
> Fix For: 1.7.0
>
> Attachments: ingest-graph.jpg, patched-ingest-graph.jpg
>
>
> In doing multi-node tests with continuous ingest, I was watching the ingest
> performance on the peer via the monitor.
> I noticed that the ingest rate followed a regular pattern: ingest would
> spike, then decrease at a (mostly) fixed interval, flat-line, and then
> repeat.
> I believe each cycle on the ingest graph is the replication of a file from
> the primary. The reduction in throughput is relative to the amount of time it
> takes to re-read the "prefix" of the file which we already replicated. I need
> to push some more logic down into the AccumuloReplicaSystem so that we can
> avoid that growing penalty for seeking over the data which we don't need to
> re-process.
> The cost is that this pushes more complexity into the AccumuloReplicaSystem,
> but I imagine that once I write an implementation that replicates to some
> other system, it will become more obvious which common pieces can be
> abstracted into a shared base class.
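
The penalty described above can be illustrated with a minimal sketch. This is not the AccumuloReplicaSystem code: the class and method names are hypothetical, an in-memory byte array stands in for the file on the primary, and fixed-size int records stand in for the entries that need replication. It only counts how many records each strategy reads when a file is replicated in batches, once by reopening the stream and re-reading the already replicated prefix, and once by keeping a single DataInputStream open.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical illustration only; none of these names come from Accumulo.
final class StreamReuseSketch {

  /** Builds an in-memory "file" of n int records (a stand-in for a file needing replication). */
  public static byte[] makeFile(int n) {
    ByteArrayOutputStream bytes = new ByteArrayOutputStream();
    try (DataOutputStream out = new DataOutputStream(bytes)) {
      for (int i = 0; i < n; i++) {
        out.writeInt(i);
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return bytes.toByteArray();
  }

  /** Opens a fresh stream per batch and re-reads the replicated prefix; returns records read. */
  public static long readsWhenReopening(byte[] file, int total, int batch) {
    long reads = 0;
    try {
      for (int done = 0; done < total; done += batch) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
          // Wasted work: read past everything that was already replicated.
          for (int i = 0; i < done; i++) {
            in.readInt();
            reads++;
          }
          // Useful work: read the next batch.
          for (int i = 0; i < batch && done + i < total; i++) {
            in.readInt();
            reads++;
          }
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return reads;
  }

  /** Keeps one stream open across batches, so each record is read once; returns records read. */
  public static long readsWhenReusing(byte[] file, int total, int batch) {
    long reads = 0;
    try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(file))) {
      for (int done = 0; done < total; done += batch) {
        for (int i = 0; i < batch && done + i < total; i++) {
          in.readInt();
          reads++;
        }
      }
    } catch (IOException e) {
      throw new UncheckedIOException(e);
    }
    return reads;
  }

  public static void main(String[] args) {
    byte[] file = makeFile(1000);
    System.out.println("reopening: " + readsWhenReopening(file, 1000, 100));
    System.out.println("reusing:   " + readsWhenReusing(file, 1000, 100));
  }
}
```

With 1000 records and batches of 100, the reopening variant reads 5500 records in total while the reusing variant reads 1000. The wasted prefix reads grow with each batch, which would be consistent with the growing penalty seen on the ingest graph, assuming the real reader behaves similarly.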