I'm code-reviewing a Phoenix PR [1] right now, which adds Tags to a
mutation's Cells in a coproc. A question has come up regarding coprocs and
the optional off-heaping of the write path in HBase 2.x and up.

For what parts of the write path (and hence, which coproc hooks) is it safe
to change the underlying Cells of a batch mutation without leaking off-heap
memory?

The HBase book entry on off-heap writes [2] only discusses the ability to
make the MemStore off-heap, but HBASE-15179 and its design doc [3] say that
the entire write stack is off-heap.

This matters because if the mutation's Cells can be assumed to be on-heap
in a RegionObserver coproc hook that runs before the MemStore commit, then
clearing the mutation's internal family map and replacing its contents with
new, altered Cells is safe (extra GC pressure aside, of course). If not, I
presume the coproc would be leaking off-heap memory, unless there's some
magic cleanup somewhere.
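
To make the pattern concrete, here is a minimal, HBase-free sketch of what
I mean by replacing the Cells. SketchCell, addTagToAllCells, and the
String-keyed family map are stand-ins made up for illustration. They are
not the HBase Cell or Mutation API, and the sketch says nothing about how
off-heap buffers get released, which is exactly the open question.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.NavigableMap;
import java.util.TreeMap;

// Hypothetical stand-ins only: none of these types come from HBase.
final class TagRewriteSketch {

    /** Immutable stand-in for a Cell: a value plus its tags. */
    static final class SketchCell {
        final byte[] value;
        final List<String> tags;

        SketchCell(byte[] value, List<String> tags) {
            this.value = value.clone();
            this.tags = Collections.unmodifiableList(new ArrayList<>(tags));
        }
    }

    /**
     * Cells are immutable, so adding a tag means building a new cell for
     * every old one, then clearing the family map and refilling it.
     */
    static void addTagToAllCells(
            NavigableMap<String, List<SketchCell>> familyMap, String tag) {
        Map<String, List<SketchCell>> replacement = new TreeMap<>();
        for (Map.Entry<String, List<SketchCell>> e : familyMap.entrySet()) {
            List<SketchCell> newCells = new ArrayList<>(e.getValue().size());
            for (SketchCell old : e.getValue()) {
                List<String> tags = new ArrayList<>(old.tags);
                tags.add(tag);
                newCells.add(new SketchCell(old.value, tags));
            }
            replacement.put(e.getKey(), newCells);
        }
        // The old cells are simply dropped here. On-heap, the GC reclaims
        // them. Whether that is enough for off-heap-backed cells is the
        // question this email is asking.
        familyMap.clear();
        familyMap.putAll(replacement);
    }

    public static void main(String[] args) {
        NavigableMap<String, List<SketchCell>> familyMap = new TreeMap<>();
        familyMap.put("cf", new ArrayList<>(Arrays.asList(
                new SketchCell(new byte[] {1}, Arrays.asList("existing")))));
        addTagToAllCells(familyMap, "added");
        System.out.println(familyMap.get("cf").get(0).tags);
    }
}
```

In the real coproc the same shape would apply to each Mutation in the
batch, with the new Cells built by whatever Cell-with-Tags factory the
coproc is using.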

If this is not a safe assumption, what would the recommended way be to
alter a Cell's Tags in a coproc, since Tags are explicitly not exposed to
the HBase client, Cells are immutable, and hence the only way to do so
would be to create new Cells in a coproc? My question's not how to create
the new Cells (that's been answered elsewhere) but how to dispose of the
old, original ones.

Also, if this is not a safe assumption, is there an accepted LP(Coproc) or
Public API that a coproc can check to see if it's in an "off-heap" mode or
not so that a leak can be avoided?

Thanks,

Geoffrey Jacoby

References:
[1] https://github.com/apache/phoenix/pull/978
[2] https://hbase.apache.org/book.html#regionserver.offheap.writepath
[3] https://docs.google.com/document/d/1fj5P8JeutQ-Uadb29ChDscMuMaJqaMNRI86C4k5S1rQ/edit
