Re: Where is the "Partitioned All Cache" doc?

Qingsheng Ren Mon, 28 Mar 2022 01:09:14 -0700

Hi, 

The optimizations you mentioned are only available in the product provided by 
Alibaba Cloud. In open-source Apache Flink there is no unified caching 
abstraction for lookup tables, and each connector has its own cache 
implementation. For example, JDBC uses a Guava cache and FileSystem uses an 
in-memory HashMap, and neither of them loads all records of the dimension table 
into the cache.
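
To illustrate the kind of cache that is filled on demand rather than loaded 
in full, here is a minimal Python sketch of a size-bounded LRU lookup cache. 
It is only an illustration under my own assumptions, not Flink or Guava code: 
the names LazyLookupCache, load_row and max_rows are hypothetical, and a plain 
dict stands in for the dimension table.

```python
from collections import OrderedDict


class LazyLookupCache:
    """Size-bounded LRU cache that loads a row only on its first lookup."""

    def __init__(self, load_row, max_rows):
        self._load_row = load_row    # hypothetical: queries the dimension table
        self._max_rows = max_rows    # hypothetical: upper bound on cached rows
        self._rows = OrderedDict()   # insertion order doubles as recency order
        self.misses = 0              # number of lookups that hit the table

    def get(self, key):
        if key in self._rows:
            self._rows.move_to_end(key)     # mark as most recently used
            return self._rows[key]
        self.misses += 1
        row = self._load_row(key)           # load just this one row
        self._rows[key] = row
        if len(self._rows) > self._max_rows:
            self._rows.popitem(last=False)  # evict the least recently used row
        return row

    def __len__(self):
        return len(self._rows)


# A plain dict stands in for the external dimension table.
dim_table = {1: "alice", 2: "bob", 3: "carol"}
cache = LazyLookupCache(dim_table.get, max_rows=2)
```

Only keys that are actually looked up ever reach the cache, and it never 
holds more than max_rows entries, so the full table is never materialized 
in memory.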


Best, 

Qingsheng


> On Mar 28, 2022, at 12:26, dz902 <dz9...@gmail.com> wrote:
> 
> Hi,
> 
> I've read some docs
> (https://help.aliyun.com/document_detail/182011.html) describing Flink
> optimization techniques that use:
> 
> - partitionedJoin = 'true'
> - cache = 'ALL'
> - blink.partialAgg.enabled=true
> 
> However I could not find any official doc references. Are these
> supported at all?
> 
> Also "partitionedJoin" seemed to have the effect of shuffling input by
> joining key so they can fit into memory. I read this
> (https://flink.apache.org/news/2015/03/13/peeking-into-Apache-Flinks-Engine-Room.html)
> and believe this is already the default behavior of Flink.
> 
> Is this optimization not needed even for huge input tables?
> 
> Thanks,
> Dai
