[ https://issues.apache.org/jira/browse/FLINK-33317?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel ]

Rui Fan updated FLINK-33317:
----------------------------
    Description: 
See FLINK-33315 for details.

This Jira focuses on avoiding unnecessary memory usage; it can reduce the 
memory cost of Replica_3 in FLINK-33315.

 

Solution:
 * Define a threshold for the cleaning mechanism of StreamConfig#SERIALIZEDUDF 
 * After getStreamOperatorFactory has been called, the 
StreamConfig#SERIALIZEDUDF can be cleaned

In general, we don't clean any configuration. However, the SERIALIZED_UDF may 
be very large when the operator includes large objects.

{@link #getStreamOperatorFactory} is used to create a StreamOperator and 
usually only needs to be called once.

Callers can clean it to reduce memory usage after calling 
{@link #getStreamOperatorFactory}.
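
The idea above can be sketched as follows. This is a minimal illustration, not 
Flink's actual StreamConfig code: the class SketchStreamConfig and the methods 
clearSerializedUdf and retainedBytes are hypothetical names introduced here, 
and plain Java serialization stands in for whatever Flink uses internally.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.HashMap;
import java.util.Map;

/** Hypothetical stand-in for StreamConfig; an assumption, not Flink's implementation. */
class SketchStreamConfig {
    // Key name chosen to mirror the entry discussed in this issue.
    static final String SERIALIZEDUDF = "serializedUDF";

    private final Map<String, byte[]> config = new HashMap<>();

    /** Stores the serialized operator factory (possibly very large) in the config. */
    void setStreamOperatorFactory(Serializable factory) throws IOException {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
            out.writeObject(factory);
        }
        config.put(SERIALIZEDUDF, bytes.toByteArray());
    }

    /** Deserializes the factory; usually only needs to be called once. */
    Object getStreamOperatorFactory() throws IOException, ClassNotFoundException {
        byte[] bytes = config.get(SERIALIZEDUDF);
        if (bytes == null) {
            throw new IllegalStateException("Serialized UDF is absent or was already cleared");
        }
        try (ObjectInputStream in = new ObjectInputStream(new ByteArrayInputStream(bytes))) {
            return in.readObject();
        }
    }

    /** Drops the serialized bytes so they can be garbage collected. */
    void clearSerializedUdf() {
        config.remove(SERIALIZEDUDF);
    }

    /** Size of the retained serialized bytes, or 0 once cleared. */
    int retainedBytes() {
        byte[] bytes = config.get(SERIALIZEDUDF);
        return bytes == null ? 0 : bytes.length;
    }
}
```

A caller would invoke getStreamOperatorFactory() once and then 
clearSerializedUdf(). In this sketch, any later call to 
getStreamOperatorFactory() fails fast instead of silently returning null.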

  was:
See FLINK-33315 for details.

This Jira focuses on avoiding unnecessary memory usage; it can reduce the 
memory cost of Replica_3 in FLINK-33315.

 

Solution:
 * Define a threshold for the cleaning mechanism of StreamConfig#SERIALIZEDUDF 
 * After getStreamOperatorFactory has been called, the 
StreamConfig#SERIALIZEDUDF can be cleaned if the value size is greater than the 
threshold
 * If it is still needed later, we can flush it to disk and load it from disk 
when needed (and clean it from memory immediately after use).


> Add cleaning mechanism for StreamConfig#SERIALIZEDUDF
> -----------------------------------------------------
>
>                 Key: FLINK-33317
>                 URL: https://issues.apache.org/jira/browse/FLINK-33317
>             Project: Flink
>          Issue Type: Sub-task
>          Components: Runtime / Configuration
>    Affects Versions: 1.17.0, 1.18.0
>            Reporter: Rui Fan
>            Assignee: Rui Fan
>            Priority: Major



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
