Thanks Aitozi for initiating this discussion. For the data cache, I have some questions:

1. In the design document, the focus is mainly on block cache. In a complete cache system, it is usually divided into distributed cache, local file cache, block cache, and key-value cache. Compared with block cache, would it be more effective to introduce a distributed cache such as Alluxio?

2. For the computing engine: What interfaces should Paimon's cache provide so that the computing engine can be aware of which computing nodes cache which data, and facilitate the deployment of computing tasks to the appropriate computing nodes at the scheduling layer?

Best,
FangYong

On Tue, Jul 16, 2024 at 10:45 AM Aitozi <gjying1...@gmail.com> wrote:

> Hi devs:
> I want to initiate a discussion on the ability to support data cache in
> the Paimon reader, aiming to accelerate the performance of scan operators
> in analytical scenarios. The detailed design document is as follows [1].
> Looking forward to your feedback.
>
> [1]:
> https://docs.google.com/document/d/1-zzDpxcubukMR-21n66OPv2ViKEFeEJ_Mivc-wW4gLM/edit?usp=sharing
>
> Thanks
> Aitozi.