Re: [DISCUSS] PIP-1: Improve Shared Writer Buffer Pool For Sink

Shammon FY Sun, 23 Apr 2023 02:51:22 -0700

Thanks for all the feedback.

To Jingsong
> Maybe we can just use some reflection method to get offHeapBuffer and
> heapMemory from Flink MemorySegment.


Yes, we can indeed construct a MemorySegment for Paimon this way, but it
may cause the segment to be released twice. Suppose Flink has allocated an
off-heap buffer, and Paimon obtains that buffer and wraps it in its own
MemorySegment. The same off-heap buffer is then referenced by both the
Flink MemorySegment and the Paimon MemorySegment. When Paimon releases it
with `UNSAFE.freeMemory(this.address)`, Flink's MemorySegment may release
it again.
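
A minimal sketch of the double release described above. All class names
here are hypothetical stand-ins, not actual Flink or Paimon code, and the
native free is simulated with a flag, since a real second
`UNSAFE.freeMemory` call would corrupt memory rather than throw:

```java
// Hypothetical stand-in for one chunk of off-heap memory.
final class NativeBuffer {
    private boolean freed = false;

    // Simulates UNSAFE.freeMemory(address); a second call is a double free.
    synchronized void free() {
        if (freed) {
            throw new IllegalStateException("buffer already freed");
        }
        freed = true;
    }
}

// Hypothetical segment wrapper; the Flink and Paimon segments are both
// modelled by this one class because only the shared buffer matters here.
final class SegmentWrapper {
    private final NativeBuffer buffer;

    SegmentWrapper(NativeBuffer buffer) {
        this.buffer = buffer;
    }

    void release() {
        buffer.free();
    }
}

final class DoubleReleaseDemo {
    // Returns true if the second release is detected as a double free.
    static boolean secondReleaseFails() {
        NativeBuffer shared = new NativeBuffer();
        SegmentWrapper flinkSegment = new SegmentWrapper(shared);
        SegmentWrapper paimonSegment = new SegmentWrapper(shared);

        paimonSegment.release();     // Paimon frees the buffer first
        try {
            flinkSegment.release();  // Flink later frees the same buffer
            return false;
        } catch (IllegalStateException e) {
            return true;
        }
    }

    public static void main(String[] args) {
        System.out.println("double free detected: " + secondReleaseFails());
    }
}
```

Neither wrapper knows the other exists, so ownership of the buffer would
have to be agreed on explicitly to avoid this.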

To Liming
> Is it possible to cause a deadlock when requesting for memory from the
> engine's managed memory? Is it necessary to add some memory checking or
> timeout mechanism here?

Flink allocates segments for parallel tasks in MemoryManager. When the
memory usage in MemoryManager hits the limit, it throws an exception
rather than waiting, so the request should not cause a deadlock.
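
A minimal sketch of that fail-fast behaviour, assuming a simple page
budget. `BudgetedPool` is a hypothetical class that only illustrates the
idea, it is not Flink's MemoryManager, and `IllegalStateException` stands
in for whatever exception the engine actually throws:

```java
// Hypothetical budgeted allocator that fails fast when the budget is used up.
final class BudgetedPool {
    private final int totalPages;
    private int usedPages;

    BudgetedPool(int totalPages) {
        this.totalPages = totalPages;
    }

    // Throws immediately instead of waiting for other tasks to return pages,
    // so a caller can never block forever on this call.
    synchronized void allocate(int pages) {
        if (pages < 0 || pages > totalPages - usedPages) {
            throw new IllegalStateException(
                    "requested " + pages + " pages, only "
                            + (totalPages - usedPages) + " available");
        }
        usedPages += pages;
    }

    synchronized void release(int pages) {
        if (pages < 0 || pages > usedPages) {
            throw new IllegalArgumentException("invalid release of " + pages + " pages");
        }
        usedPages -= pages;
    }

    synchronized int availablePages() {
        return totalPages - usedPages;
    }
}

final class BudgetedPoolDemo {
    public static void main(String[] args) {
        BudgetedPool pool = new BudgetedPool(4);
        pool.allocate(3);
        try {
            pool.allocate(2);  // exceeds the remaining budget
        } catch (IllegalStateException e) {
            System.out.println("allocation rejected: " + e.getMessage());
        }
    }
}
```

A failed request leaves the pool unchanged, so the task fails with a clear
error instead of hanging.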

To Guojun
> One question I'm thinking about is that will this increase the bar of
> writing performance maintenance on Paimon? Like how to decide an
> appropriate memory weight for users' writing jobs.

Currently, users can configure managed memory weights for AGG and window
operators in Flink jobs, which is similar to the writer buffer pool weight
configured in Paimon. So for Flink users, I think this will not be a
problem.

Best,
Shammon FY


On Fri, Apr 21, 2023 at 11:42 AM Guojun Li <[email protected]> wrote:

> Hi Shammon,
>
> Thank you for writing up the proposal. It's great to introduce this unified
> memory management for Paimon!
>
> One question I'm thinking about is that will this increase the bar of
> writing performance maintenance on Paimon? Like how to decide an
> appropriate memory weight for users' writing jobs.
>
> Thanks,
> Guojun
>
>
>
>
> On Thu, Apr 20, 2023 at 8:45 PM Ming Li <[email protected]> wrote:
>
> > Thanks Shammon for the proposal.
> >
> > For me it is more appropriate to leave the memory management to the
> > computing engine.
> >
> > But I have a small question about this proposal. If the engine's memory is
> > not configured properly, is it possible to cause a deadlock when requesting
> > for memory from the engine's managed memory? Is it necessary to add some
> > memory checking or timeout mechanism here?
> >
> >
> > Thanks,
> > Ming Li
> >
> >
> > On Wed, Apr 19, 2023 at 9:57 AM Shammon FY <[email protected]> wrote:
> >
> > > Hi devs:
> > >
> > > I would like to start a discussion of PIP-1: Improve Shared Writer Buffer
> > > Pool For Sink [1]. Currently Paimon sink task creates a heap memory pool
> > > which is shared by writers. When there are multiple tasks in an Executor,
> > > it may cause FullGC, performance issues and even OOM.
> > >
> > > This PIP aims to improve the buffer pool for writers in Paimon tasks.
> > > Paimon tasks can create memory pools based on Executor Memory which will be
> > > managed by Executor, such as Managed Memory in Flink TaskManager. It will
> > > improve the stability and performance of sinks by managing writer buffers
> > > for multiple tasks through Executor.
> > >
> > > Looking forward to your feedback, thanks.
> > >
> > >
> > > [1]
> > > https://cwiki.apache.org/confluence/display/PAIMON/PIP-1%3A+Improve+Shared+Writer+Buffer+Pool+For+Sink
> > >
> > >
> > > Best,
> > > Shammon FY
> > >
> >
>
