Thanks Aitozi for initiating this discussion. For the data cache, I have
some questions:

1. The design document focuses mainly on the block cache. A complete cache
system is usually layered into a distributed cache, a local file cache, a
block cache, and a key-value cache. Would introducing a distributed cache
such as Alluxio be more effective than a block cache alone?

2. For the computing engine: what interfaces should Paimon's cache provide
so that the engine knows which compute nodes cache which data, and so that
the scheduling layer can place tasks on the appropriate nodes?
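
To make question 2 concrete, here is a rough sketch of the kind of interface
I have in mind. Every name in it (CacheLocationProvider, reportCached,
LocalityAwareAssigner, the host and file names) is hypothetical and is not an
existing Paimon or engine API; it only assumes that readers can report which
files they have cached and that the scheduler can ask for those hosts.

```java
import java.util.*;

// Hypothetical interface, NOT an existing Paimon API: lets an engine ask
// which hosts are believed to hold cached data for a given file.
interface CacheLocationProvider {
    List<String> cachedHosts(String filePath);
}

// Toy in-memory implementation, assumed to be fed by reports from readers.
final class InMemoryCacheLocationProvider implements CacheLocationProvider {
    private final Map<String, Set<String>> hostsByFile = new HashMap<>();

    // Record that `host` has cached (part of) `filePath`.
    void reportCached(String host, String filePath) {
        hostsByFile.computeIfAbsent(filePath, k -> new LinkedHashSet<>()).add(host);
    }

    @Override
    public List<String> cachedHosts(String filePath) {
        return new ArrayList<>(hostsByFile.getOrDefault(filePath, Collections.emptySet()));
    }
}

// Hypothetical scheduler-side helper: prefer a host that already caches the
// file, otherwise fall back to the first available host.
final class LocalityAwareAssigner {
    static String pickHost(String filePath,
                           List<String> availableHosts,
                           CacheLocationProvider provider) {
        for (String host : provider.cachedHosts(filePath)) {
            if (availableHosts.contains(host)) {
                return host;
            }
        }
        return availableHosts.get(0);
    }
}
```

With something like this, a split for a file that node-2 has reported as
cached would be placed on node-2 when that node is available, and on any
available node otherwise. How the reports stay fresh (eviction, node
failure) is left open in this sketch.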

Best,
FangYong

On Tue, Jul 16, 2024 at 10:45 AM Aitozi <gjying1...@gmail.com> wrote:

> Hi devs:
>     I want to initiate a discussion on the ability to support data cache in
> the Paimon reader, aiming to accelerate the performance of scan operators
> in analytical scenarios. The detailed design document is as follows [1].
> Looking forward to your feedback.
>
>
> [1]:
>
> https://docs.google.com/document/d/1-zzDpxcubukMR-21n66OPv2ViKEFeEJ_Mivc-wW4gLM/edit?usp=sharing
>
> Thanks
> Aitozi.
>
