Hi
Anoop has answered this clearly, I believe. The short answer is that in your
CP it is better to copy/clone the cells so that no reference to the original
remains. I believe the index-related WAL codec in Phoenix was also trying to
do something similar, if I remember correctly (I may be wrong, though).
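
To illustrate why cloning matters, here is a minimal stand-alone sketch. It
uses plain java.nio rather than any HBase class, and the names
PooledBufferDemo, BufferBackedValue and receive are hypothetical, made up
only for this example. It assumes the behaviour Anoop describes: a pooled
buffer is reused for the next payload once the call is over.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class PooledBufferDemo {

    /** Hypothetical stand-in for a Cell whose bytes live in a pooled buffer. */
    static final class BufferBackedValue {
        private final ByteBuffer buf;
        private final int offset;
        private final int length;

        BufferBackedValue(ByteBuffer buf, int offset, int length) {
            this.buf = buf;
            this.offset = offset;
            this.length = length;
        }

        /** Deep-copies the bytes into a private on-heap array. */
        byte[] cloneBytes() {
            byte[] out = new byte[length];
            ByteBuffer view = buf.duplicate(); // independent position, shared content
            view.position(offset);
            view.get(out);
            return out;
        }

        /** Reads whatever the shared buffer holds at the time of the call. */
        String read() {
            return new String(cloneBytes(), StandardCharsets.UTF_8);
        }
    }

    /** Simulates the RPC layer reading a payload into the pooled buffer. */
    static BufferBackedValue receive(ByteBuffer pooled, String payload) {
        byte[] bytes = payload.getBytes(StandardCharsets.UTF_8);
        ByteBuffer view = pooled.duplicate();
        view.clear();
        view.put(bytes); // overwrites any earlier payload at the same offset
        return new BufferBackedValue(pooled, 0, bytes.length);
    }

    /** Keeps only a reference: the value changes when the buffer is reused. */
    public static String retainedViewAfterReuse() {
        ByteBuffer pooled = ByteBuffer.allocateDirect(64);
        BufferBackedValue kept = receive(pooled, "value-A");
        receive(pooled, "value-B"); // a later call reuses the same buffer
        return kept.read();
    }

    /** Clones before the buffer is reused: the copy stays intact. */
    public static String clonedCopyAfterReuse() {
        ByteBuffer pooled = ByteBuffer.allocateDirect(64);
        byte[] copy = receive(pooled, "value-A").cloneBytes();
        receive(pooled, "value-B");
        return new String(copy, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        System.out.println("retained view: " + retainedViewAfterReuse()); // value-B
        System.out.println("cloned copy:   " + clonedCopyAfterReuse());   // value-A
    }
}
```

The retained view silently reports the later payload, while the cloned copy
keeps the original bytes. In a real CP the equivalent step would be cloning
the Cell (or the parts you need) before the hook returns.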

Regards
Ram

On Thu, Dec 3, 2020 at 9:30 AM Anoop John <[email protected]> wrote:

> Hi Geoffrey,
>
> In the case of an off-heap-backed write path (at the RPC layer itself), the
> write payload is accepted into DBBs (direct ByteBuffers) that we get from a
> pool, and the cells are created over these DBBs. If we add Tags in CPs, a
> new Cell POJO is created, but it still refers to the old POJO for all parts
> except the Tags; see TagRewriteCell, for example. Only when we add the cells
> to the MemStore do we retrieve them from this RPC-side buffer. In the write
> path, once the call completes and returns to the RPC layer, we release the
> buffer. So there should be no worry about a leak.
> The only thing to be careful about in CPs is keeping references to Cells. In
> such cases, it's advised to clone the cell (or the parts of it you need) and
> keep a reference to the clone. When the RPC side uses pooled DBBs, not doing
> this correctly can corrupt the Cell later. (The buffer is released once the
> RPC call is over and is later reused to read some other write payload.)
> Even with on-heap buffers at the RPC layer, keeping such a reference without
> cloning can cause issues, because it prevents the RPC payload read buffer
> (typically much larger than a single cell) from being GCed. I know the
> Phoenix Jira's aim is to create Cells with added tags; I mention this just
> as a pointer.
>
> Anoop
>
> On Thu, Dec 3, 2020 at 2:41 AM Geoffrey Jacoby <[email protected]> wrote:
>
> > I'm code-reviewing a Phoenix PR [1] right now, which adds Tags to a
> > mutation's Cells in a coproc. A question has come up regarding coprocs
> > and the optional off-heaping of the write path in HBase 2.x and up.
> >
> > For what parts of the write path (and hence, which coproc hooks) is it
> > safe to change the underlying Cells of a batch mutation without leaking
> > off-heap memory?
> >
> > The HBase book entry on off-heap writes [2] just discusses the ability to
> > make the MemStore off-heap, but HBASE-15179 and its design doc [3] say
> > that the entire write stack is off-heap.
> >
> > This matters because if, in a RegionObserver coproc hook (one that runs
> > before the MemStore commit), the mutation Cells can be assumed to be
> > on-heap, then clearing the internal family map of the mutation and
> > replacing its Cells with new, altered Cells is safe. (Extra GC pressure
> > aside, of course.) If not, I presume the coproc would be leaking off-heap
> > memory (unless there's magic cleanup somewhere?)
> >
> > If this is not a safe assumption, what would the recommended way be to
> > alter a Cell's Tags in a coproc, since Tags are explicitly not exposed to
> > the HBase client, Cells are immutable, and hence the only way to do so
> > would be to create new Cells in a coproc? My question's not how to create
> > the new Cells (that's been answered elsewhere) but how to dispose of the
> > old, original ones.
> >
> > Also, if this is not a safe assumption, is there an accepted LP(Coproc)
> > or Public API that a coproc can check to see if it's in an "off-heap"
> > mode or not so that a leak can be avoided?
> >
> > Thanks,
> >
> > Geoffrey Jacoby
> >
> > References:
> > [1] https://github.com/apache/phoenix/pull/978
> > [2] https://hbase.apache.org/book.html#regionserver.offheap.writepath
> > [3] https://docs.google.com/document/d/1fj5P8JeutQ-Uadb29ChDscMuMaJqaMNRI86C4k5S1rQ/edit
> >
>
