[
https://issues.apache.org/jira/browse/ACCUMULO-519?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=13577870#comment-13577870
]
Keith Turner edited comment on ACCUMULO-519 at 2/13/13 7:52 PM:
----------------------------------------------------------------
Two things to consider:
* Isolation : need a strategy for handling this. Currently isolated reads
keep around all rfiles they are currently reading a row from, even rfiles that
were compacted away.
* Memory allocation : need a new strategy for deciding the maximum memory that
can be used by in-memory maps, since we will have to read from and write to
memory for these in-memory compactions.
was (Author: kturner):
Two things to consider
* Isolation : need a strategy for handling this. Currently isolated reads
keep all rfiles they are currently reading a row from around.
* Memory allocation : need a new strategy for deciding maximum memory that can
be used by in memory maps since will have to read and write to memory for these
in memory compactions.
> support in-memory compactions
> -----------------------------
>
> Key: ACCUMULO-519
> URL: https://issues.apache.org/jira/browse/ACCUMULO-519
> Project: Accumulo
> Issue Type: Improvement
> Components: tserver
> Reporter: Adam Fuchs
> Assignee: Adam Fuchs
>
> There are several factors that influence how big to make the in-memory write
> buffer (tserver.memory.maps.max) for Accumulo. Two dominant factors that
> conflict with each other are:
> # Overall disk I/O depends somewhat on the log of the ratio of tablet size to
> initial file size. Bigger write buffer leads to bigger initial files, and can
> lead to less overall disk I/O.
> # Aggregation, versioning, and deleting take place in the iterator tree,
> which only applies during compactions and scans. The in-memory write buffer
> can buffer many versions of a given key, and scans can be slow if compactions
> are infrequent.
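The first factor can be put into rough numbers. The sketch below assumes that each round of major compaction grows files by a roughly constant factor (called `ratio` here, a hypothetical parameter that the description above does not name) and that the initial file is about as large as the write buffer. Under those assumptions the number of times a byte gets rewritten scales as log(tablet size / initial file size) / log(ratio). It is a back-of-the-envelope model, not a measurement of Accumulo.

```java
final class WriteAmplificationSketch {

    /**
     * Rough estimate of how many times a byte is rewritten before it ends up
     * in a tablet-sized file, assuming files grow by `ratio` per compaction.
     * `ratio` is an assumption of this sketch, not a value from the issue.
     */
    static double estimatedRewrites(double tabletBytes, double initialFileBytes, double ratio) {
        if (initialFileBytes >= tabletBytes) {
            return 0.0; // the first file already holds the whole tablet
        }
        return Math.log(tabletBytes / initialFileBytes) / Math.log(ratio);
    }

    public static void main(String[] args) {
        double tabletBytes = 1024.0 * 1024 * 1024; // hypothetical 1 GB tablet
        double ratio = 3.0;                        // hypothetical growth factor
        for (long bufferMb : new long[] {1, 16, 256}) {
            double initialFileBytes = bufferMb * 1024.0 * 1024;
            System.out.printf("initial file %4d MB -> ~%.1f rewrites%n",
                    bufferMb, estimatedRewrites(tabletBytes, initialFileBytes, ratio));
        }
    }
}
```

A bigger write buffer shrinks the ratio inside the logarithm, which is why it can reduce overall disk I/O, but only logarithmically.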
> One solution would be to run some sort of stepped compaction in-memory, in
> which the iterator tree is applied in some sort of log-structured fashion. We
> can consider the minor compaction to be two pipelined steps: serialization of
> map entries, and writing the serialized form to disk. After we have written
> the serialized form to disk, we can free up the write-ahead logs associated
> with that data.
> I propose the following:
> # We should buffer the serialized RFile form in-memory instead of writing it
> to disk (call it a micro-compaction).
> # We should implement a merging step for merging existing buffered RFiles
> with newly serialized buffers, using the same algorithm that we use for major
> compaction file selection.
> # The in-memory buffer should be micro-compacted aggressively (whenever we
> have a thread free, with some minimum allocation of CPU and memory I/O
> resources to this task).
> # The current triggers that we use for minor compactions should be used to
> select buffered RFiles from memory and dump them to disk, at which point we
> can drop the write-ahead log references.
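The four steps above can be sketched as a toy model. Everything in it is hypothetical: `MicroCompactionSketch`, `Entry`, and the method names are illustration only, a sorted list stands in for a serialized RFile buffer, "keep the newest version per key" stands in for the iterator tree, and the size-ratio rule in `selectRuns` is only assumed to resemble the major compaction file selection, so it should be checked against the tserver code.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.Comparator;
import java.util.IdentityHashMap;
import java.util.List;
import java.util.Set;
import java.util.TreeMap;

final class MicroCompactionSketch {
    /** One version of a key; a higher ts means a newer version. */
    static final class Entry {
        final String key; final long ts; final String value;
        Entry(String key, long ts, String value) { this.key = key; this.ts = ts; this.value = value; }
    }

    private final List<Entry> writeBuffer = new ArrayList<>();  // unsorted, keeps every version
    private final List<List<Entry>> runs = new ArrayList<>();   // sorted runs, one version per key
    private final double ratio;
    private long clock = 0;

    MicroCompactionSketch(double ratio) { this.ratio = ratio; }

    void put(String key, String value) { writeBuffer.add(new Entry(key, clock++, value)); }

    /** Step 1: turn the write buffer into a sorted, versioned run that stays in memory. */
    void microCompact() {
        if (writeBuffer.isEmpty()) return;
        runs.add(newestPerKey(writeBuffer));
        writeBuffer.clear();
    }

    /** Step 2: merge the runs picked by the size-ratio rule, if any qualify. */
    void mergeSelectedRuns() {
        List<List<Entry>> chosen = selectRuns(runs, ratio);
        if (chosen.isEmpty()) return;
        Set<List<Entry>> chosenSet = Collections.newSetFromMap(new IdentityHashMap<>());
        chosenSet.addAll(chosen);
        List<Entry> all = new ArrayList<>();
        for (List<Entry> run : chosen) all.addAll(run);
        runs.removeIf(chosenSet::contains);
        runs.add(newestPerKey(all));
    }

    /** Assumed rule: drop the largest run until largest * ratio <= total size of the remaining set. */
    static List<List<Entry>> selectRuns(List<List<Entry>> candidates, double ratio) {
        List<List<Entry>> sorted = new ArrayList<>(candidates);
        sorted.sort(Comparator.comparingInt((List<Entry> r) -> r.size()).reversed());
        long total = 0;
        for (List<Entry> r : sorted) total += r.size();
        for (int start = 0; sorted.size() - start >= 2; start++) {
            if (sorted.get(start).size() * ratio <= total) {
                return new ArrayList<>(sorted.subList(start, sorted.size()));
            }
            total -= sorted.get(start).size();
        }
        return new ArrayList<>();
    }

    /** Step 4: hand everything buffered to the caller, standing in for the write to disk. */
    List<Entry> flush() {
        microCompact();
        List<Entry> all = new ArrayList<>();
        for (List<Entry> run : runs) all.addAll(run);
        runs.clear();
        return newestPerKey(all);
    }

    int bufferedEntries() {
        int n = writeBuffer.size();
        for (List<Entry> run : runs) n += run.size();
        return n;
    }

    /** Stand-in for the iterator tree: keep only the newest version of each key, in key order. */
    static List<Entry> newestPerKey(List<Entry> entries) {
        TreeMap<String, Entry> newest = new TreeMap<>();
        for (Entry e : entries) {
            Entry cur = newest.get(e.key);
            if (cur == null || e.ts > cur.ts) newest.put(e.key, e);
        }
        return new ArrayList<>(newest.values());
    }

    public static void main(String[] args) {
        MicroCompactionSketch buffer = new MicroCompactionSketch(3.0);
        for (int batch = 0; batch < 3; batch++) {
            for (int i = 0; i < 1000; i++) buffer.put("row" + (i % 5), "v" + i);
            buffer.microCompact();
        }
        buffer.mergeSelectedRuns();
        System.out.println("entries still buffered: " + buffer.bufferedEntries());
    }
}
```

Step 3 (running micro-compactions whenever a thread is free) is left out; in this model it would amount to calling `microCompact()` and `mergeSelectedRuns()` from a background thread. The sketch also ignores write-ahead log bookkeeping and the isolation and memory-accounting concerns raised in the comment above.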
> Overall this will allow users to keep the initial files generated by minor
> compactions large while alleviating the second concern of buffering too many
> versions of the same key. Two use cases that will benefit greatly from this
> are ACCUMULO-348 (lots of updates to the default tablet info in the !METADATA
> table), and aggregation in which there are a small number of keys. Other
> considerations that also affect this space are:
> # RFiles are column-oriented (with locality groups), while the in-memory map
> is only row-oriented. Moving to a column-oriented structure sooner would
> benefit some queries.
> # RFiles are optimized for sequential access while the in-memory write buffer
> requires lots of random memory access to read a stream of key/value pairs in
> key order.
> # RFiles use configurable compression, while the in-memory map only uses
> hierarchical organization. RFiles generally get better compression.
> # Currently, writing a column-oriented RFile requires scanning the entire
> in-memory map for each locality group. Bigger in-memory maps can take a long
> time to re-order for minor compaction.
> # Memory fragmentation and garbage collection in the JVM are big concerns
> that a lot of work has gone into. We need to be considerate of those factors
> in implementing this change.