[
https://issues.apache.org/jira/browse/FLINK-16392?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
Chen Qin closed FLINK-16392.
----------------------------
Resolution: Feedback Received
> oneside sorted cache in intervaljoin
> ------------------------------------
>
> Key: FLINK-16392
> URL: https://issues.apache.org/jira/browse/FLINK-16392
> Project: Flink
> Issue Type: Improvement
> Components: API / DataStream
> Affects Versions: 1.10.0
> Reporter: Chen Qin
> Priority: Major
> Labels: pull-request-available
> Fix For: 1.11.0
>
> Time Spent: 10m
> Remaining Estimate: 0h
>
> Native intervaljoin rely on statebackend(e.g rocksdb) to insert/fetch left
> and right buffer. This design choice reduce minimize heap memory footprint
> while bounded process throughput of single taskmanager iops to rocksdb access
> speed. Here at Pinterest, we have some large use cases where developers join
> large and slow evolving data stream (e.g post updates in last 28 days) with
> web traffic datastream (e.g post views up to 28 days after given update).
> This post some challenge to current implementation of intervaljoin
> * partitioned rocksdb needs to keep both updates and views for 28 days,
> large buffer(especially view stream side) cause rocksdb slow down and lead to
> overall interval join performance degregate quickly as state build up.
> * view stream is web scale, even after setting large parallelism it can put
> lot of pressure on each subtask and backpressure entire job
> In proposed implementation, we plan to introduce two changes
> * support ProcessJoinFunction settings to opt-in earlier cleanup time of
> right stream(e.g view stream don't have to stay in buffer for 28 days and
> wait for update stream to join, related post views happens after update in
> event time semantic) This optimization can reduce state size to improve
> rocksdb throughput. If extreme case, user can opt-in in flight join and skip
> write into right view stream buffer to save iops budget on each subtask
> * support ProcessJoinFunction settings to expedite keyed lookup of slow
> changing stream. Instead of every post view pull post updates from rocksdb.
> user can opt-in and having one side buffer cache available in memory. If a
> given post update, cache load recent views from right buffer and use
> sortedMap to find buckets. If a given post view, cache load recent updates
> from left buffer to memory. When another view for that post arrives, flink
> save cost of rocksdb access.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)