Hi Geoffrey,

On an off-heap-backed write path (at the RPC layer itself), the write
payload is read into DBBs (direct ByteBuffers) that we take from a pool,
and the Cells are created over those DBBs. If you add Tags in a CP, a new
Cell POJO is created, but it still refers to the old POJO for every part
except the Tags; see TagRewriteCell for an example. Only when we add the
Cells to the MemStore is the data copied out of the RPC-side buffer. In
the write path, once the call completes and control returns to the RPC
layer, we release the buffer, so there should be no worry about a leak.
The one thing to be careful about in CPs is keeping references to Cells.
In such cases it's advisable to clone the Cell (or the parts of it you
need) and keep a reference to the clone. When the RPC side uses pooled
DBBs, not doing this can corrupt the Cell later: the buffer is released
once the RPC call is over and is then reused to read some other write
payload. Even with on-heap buffers at the RPC layer, keeping such a
reference without cloning can cause problems, because it prevents the RPC
payload read buffer (typically much larger than a single Cell) from being
GCed. I know the Phoenix Jira's aim is just to create Cells with added
Tags; I mention this only as a pointer.
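
To make the pooled-buffer point concrete, here is a minimal plain-Java
sketch of the hazard. It deliberately uses only java.nio and no HBase API:
PooledBufferDemo and its helpers (write, view, copy, text) are hypothetical
names made up for illustration, and the single direct ByteBuffer merely
stands in for a pooled RPC buffer. It shows that a retained view changes
when the buffer is reused, while a copy taken beforehand does not.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustration only: a stand-in for a pooled RPC buffer, not HBase code.
class PooledBufferDemo {

    // Simulates the RPC layer reading a new payload into the pooled buffer.
    static void write(ByteBuffer pooled, String payload) {
        ByteBuffer d = pooled.duplicate();
        d.clear();
        d.put(payload.getBytes(StandardCharsets.UTF_8));
    }

    // Returns a view over the pooled bytes; nothing is copied.
    static ByteBuffer view(ByteBuffer pooled, int offset, int length) {
        ByteBuffer d = pooled.duplicate();
        d.clear();
        d.position(offset);
        d.limit(offset + length);
        return d.slice();
    }

    // Copies the bytes into a fresh on-heap array (the "clone" approach).
    static byte[] copy(ByteBuffer pooled, int offset, int length) {
        byte[] out = new byte[length];
        ByteBuffer d = pooled.duplicate();
        d.clear();
        d.position(offset);
        d.get(out, 0, length);
        return out;
    }

    // Decodes the remaining bytes of a buffer without moving its position.
    static String text(ByteBuffer b) {
        byte[] out = new byte[b.remaining()];
        b.duplicate().get(out);
        return new String(out, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer pooled = ByteBuffer.allocateDirect(16);

        // First "RPC call": the payload lands in the pooled buffer.
        write(pooled, "value-1");
        ByteBuffer retainedView = view(pooled, 0, 7); // kept without cloning
        byte[] retainedCopy = copy(pooled, 0, 7);     // cloned before the call ends

        // The call ends, the buffer returns to the pool and is reused.
        write(pooled, "other-7");

        System.out.println(text(retainedView));       // prints "other-7"
        System.out.println(new String(retainedCopy, StandardCharsets.UTF_8)); // prints "value-1"
    }
}
```

In a real CP the copy step would be whatever Cell-cloning utility your
HBase version provides; the sketch only shows why the copy has to happen
before the RPC call completes.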

Anoop

On Thu, Dec 3, 2020 at 2:41 AM Geoffrey Jacoby <[email protected]> wrote:

> I'm code-reviewing a Phoenix PR [1] right now, which adds Tags to a
> mutation's Cells in a coproc. A question has come up regarding coprocs and
> the optional off-heaping of the write path in HBase 2.x and up.
>
> For what parts of the write path (and hence, which coproc hooks) is it safe
> to change the underlying Cells of a batch mutation without leaking off-heap
> memory?
>
> The HBase book entry on off-heap writes [2] just discusses the ability to
> make the MemStore off-heap, but HBASE-15179 and its design doc[3] say that
> the entire write stack is off-heap.
>
> Why this matters is if in a RegionObserver coproc hook (that's before the
> MemStore commit) the mutation Cells can be assumed to be on-heap, then
> clearing the internal family map of the mutation and replacing them with
> new, altered Cells is safe. (Extra GC pressure aside, of course.) If not, I
> presume the coproc would be leaking off-heap memory (unless there's magic
> cleanup somewhere?)
>
> If this is not a safe assumption, what would the recommended way be to
> alter a Cell's Tags in a coproc, since Tags are explicitly not exposed to
> the HBase client, Cells are immutable, and hence the only way to do so
> would be to create new Cells in a coproc? My question's not how to create
> the new Cells (that's been answered elsewhere) but how to dispose of the
> old, original ones.
>
> Also, if this is not a safe assumption, is there an accepted LP(Coproc) or
> Public API that a coproc can check to see if it's in an "off-heap" mode or
> not so that a leak can be avoided?
>
> Thanks,
>
> Geoffrey Jacoby
>
> References:
> [1] https://github.com/apache/phoenix/pull/978
> [2] https://hbase.apache.org/book.html#regionserver.offheap.writepath
> [3] https://docs.google.com/document/d/1fj5P8JeutQ-Uadb29ChDscMuMaJqaMNRI86C4k5S1rQ/edit
>
