I'm code-reviewing a Phoenix PR [1] right now, which adds Tags to a mutation's Cells in a coproc. A question has come up regarding coprocs and the optional off-heaping of the write path in HBase 2.x and up.
For which parts of the write path (and hence, which coproc hooks) is it safe to change the underlying Cells of a batch mutation without leaking off-heap memory? The HBase book entry on off-heap writes [2] only discusses the ability to make the MemStore off-heap, but HBASE-15179 and its design doc [3] say that the entire write stack is off-heap.

This matters because if, in a RegionObserver coproc hook that runs before the MemStore commit, the mutation's Cells can be assumed to be on-heap, then clearing the mutation's internal family map and replacing its contents with new, altered Cells is safe. (Extra GC pressure aside, of course.) If not, I presume the coproc would be leaking off-heap memory (unless there's magic cleanup somewhere?).

If this is not a safe assumption, what would be the recommended way to alter a Cell's Tags in a coproc? Tags are explicitly not exposed to the HBase client and Cells are immutable, so the only way to do so is to create new Cells in a coproc. My question is not how to create the new Cells (that's been answered elsewhere) but how to dispose of the old, original ones.

Also, if this is not a safe assumption, is there an accepted LP(Coproc) or Public API that a coproc can check to see whether it's in an "off-heap" mode, so that a leak can be avoided?

Thanks,

Geoffrey Jacoby

References:
[1] https://github.com/apache/phoenix/pull/978
[2] https://hbase.apache.org/book.html#regionserver.offheap.writepath
[3] https://docs.google.com/document/d/1fj5P8JeutQ-Uadb29ChDscMuMaJqaMNRI86C4k5S1rQ/edit
