[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-12-22 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799931#comment-17799931
 ] 

Matthias Pohl commented on FLINK-33623:
---

Unfortunately, I haven't had time to look into it yet. It's on my list of 
items to check after the holiday week.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>  Components: API / Core
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
>  Labels: metaspace-leak
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!



--
This message was sent by Atlassian Jira
(v8.20.10#820010)


[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-12-22 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17799699#comment-17799699
 ] 

gabrywu commented on FLINK-33623:
-

[~mapohl] any updates?



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791480#comment-17791480
 ] 

Matthias Pohl commented on FLINK-33623:
---

Thanks for the pointers. I'm gonna have a look.



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791413#comment-17791413
 ] 

gabrywu commented on FLINK-33623:
-

Setting `classloader.check-leaked-classloader = false` causes 
ChildFirstClassLoader to leak. If it is set to true, 
SafetyNetWrapperClassLoader leaks instead.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png, image.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!





[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-29 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17791412#comment-17791412
 ] 

gabrywu commented on FLINK-33623:
-

[~mapohl] It's not related to FLINK-25023. 

You can start a minimal Flink cluster on your desktop and submit the example 
`$FLINK_HOME/examples/batch/WordCount.jar`; the leak will show up.

The Flink Netty Client uses 
`org.apache.flink.runtime.io.network.netty.NettyServer.THREAD_FACTORY_BUILDER` 
to create a thread when a Flink job starts. This factory creates the thread 
with the parent's context class loader as its `contextClassLoader`.

Here is a clue.

!image.png!
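
The inheritance described above is plain JDK behavior and can be reproduced without Flink. The following is only a minimal sketch, not Flink code: the class name `InheritedContextClassLoaderDemo` and the empty `URLClassLoader` standing in for a user-code class loader are assumptions made for illustration.

```java
import java.net.URL;
import java.net.URLClassLoader;

public class InheritedContextClassLoaderDemo {

    /** Creates a thread from the calling thread and returns the context class loader it inherited. */
    static ClassLoader loaderInheritedByNewThread() {
        // A new Thread copies the context class loader of the thread that constructs it.
        Thread child = new Thread(() -> { });
        return child.getContextClassLoader();
    }

    /** Temporarily installs userLoader on the calling thread, then creates a thread from it. */
    static ClassLoader spawnWhileUsing(ClassLoader userLoader) {
        Thread current = Thread.currentThread();
        ClassLoader previous = current.getContextClassLoader();
        current.setContextClassLoader(userLoader);
        try {
            return loaderInheritedByNewThread();
        } finally {
            // Restore the original loader on the calling thread.
            current.setContextClassLoader(previous);
        }
    }

    public static void main(String[] args) throws Exception {
        // Hypothetical stand-in for a user-code class loader.
        try (URLClassLoader userLoader = new URLClassLoader(
                new URL[0], InheritedContextClassLoaderDemo.class.getClassLoader())) {
            ClassLoader inherited = spawnWhileUsing(userLoader);
            System.out.println("child inherited user loader: " + (inherited == userLoader));
        }
    }
}
```

If such a thread outlives the job, it keeps a strong reference to the user-code class loader, which would keep the classes it loaded in metaspace.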



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-28 Thread Matthias Pohl (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790585#comment-17790585
 ] 

Matthias Pohl commented on FLINK-33623:
---

Thanks for raising this issue, [~gabry.wu]. Can you provide a minimal Flink 
example that makes this reproducible? It's hard to identify where the 
classloader is coming from (I couldn't find any code segment where the 
classloader is explicitly set). I'm wondering whether it's related to 
FLINK-25023. In FLINK-25023, we experienced a classloader leak caused by 
certain user code triggering the thread creation (in the case of FLINK-33623: 
the thread that's used by the Netty client), with the thread deriving its 
classloader from the user code (see the [FLINK-25023 PR 
comment|https://github.com/apache/flink/pull/17916#discussion_r761288714] from 
[~dmvk] for a more detailed description of what's going on).

What do you mean by the following text snippet in the issue description:
{quote}TIPs

classloader.check-leaked-classloader = false
{quote}

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> TIPs
>  classloader.check-leaked-classloader = false
>  
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!





[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-27 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790066#comment-17790066
 ] 

Martijn Visser commented on FLINK-33623:


[~mapohl] WDYT?



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-27 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17790058#comment-17790058
 ] 

gabrywu commented on FLINK-33623:
-

When I change the contextClassLoader of the Flink Netty Client thread to null, 
everything works well. [~martijnvisser] 



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-23 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789133#comment-17789133
 ] 

gabrywu commented on FLINK-33623:
-

[~martijnvisser] 1.18.0 still has the same issue.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5, 1.18.0
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!





[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-23 Thread Martijn Visser (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17789024#comment-17789024
 ] 

Martijn Visser commented on FLINK-33623:


[~gabry.wu] Please validate with the latest version of Flink, since 1.13 is no 
longer supported by the community.

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!





[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-22 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788979#comment-17788979
 ] 

gabrywu commented on FLINK-33623:
-

Flink Netty Client and Flink Netty Server use different classloaders: Netty 
Server uses sun.misc.Launcher$AppClassLoader, whereas Flink Netty Client uses 
org.apache.flink.runtime.execution.librarycache.FlinkUserCodeClassLoaders
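
An observation like this can be gathered with standard JDK calls. The sketch below is illustrative only: the class `ThreadContextClassLoaderReport` and its method are hypothetical names, and the code would have to run inside the TaskManager JVM (or be replaced by a heap dump or debugger view) to see the Netty threads.

```java
import java.util.Map;
import java.util.TreeMap;

public class ThreadContextClassLoaderReport {

    /**
     * Maps the name of every live thread whose name starts with namePrefix
     * to the class name of its context class loader ("null" if none is set).
     */
    static Map<String, String> contextClassLoadersByThread(String namePrefix) {
        Map<String, String> result = new TreeMap<>();
        for (Thread thread : Thread.getAllStackTraces().keySet()) {
            if (thread.getName().startsWith(namePrefix)) {
                ClassLoader loader = thread.getContextClassLoader();
                result.put(thread.getName(),
                        loader == null ? "null" : loader.getClass().getName());
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // "Flink Netty" is the thread-name prefix mentioned in this issue.
        contextClassLoadersByThread("Flink Netty")
                .forEach((name, loader) -> System.out.println(name + " -> " + loader));
    }
}
```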



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-22 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788963#comment-17788963
 ] 

gabrywu commented on FLINK-33623:
-

I think Flink Netty Client and Flink Netty Server have nothing to do with user 
code, so it would be better for their thread factory to use the app class 
loader.
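
One way to picture this suggestion is a thread factory that pins a fixed context class loader on every thread it creates. This is a sketch under assumptions, not Flink's implementation or an agreed fix: `FixedContextClassLoaderThreadFactory` is a hypothetical class, and the system class loader is used here as a stand-in for the "app class loader".

```java
import java.util.concurrent.ThreadFactory;
import java.util.concurrent.atomic.AtomicInteger;

public class FixedContextClassLoaderThreadFactory implements ThreadFactory {

    private final String namePrefix;
    private final ClassLoader contextClassLoader;
    private final AtomicInteger counter = new AtomicInteger();

    public FixedContextClassLoaderThreadFactory(String namePrefix, ClassLoader contextClassLoader) {
        this.namePrefix = namePrefix;
        this.contextClassLoader = contextClassLoader;
    }

    @Override
    public Thread newThread(Runnable runnable) {
        Thread thread = new Thread(runnable, namePrefix + " Thread " + counter.getAndIncrement());
        thread.setDaemon(true);
        // Override the loader inherited from the creating thread, so that
        // a user-code class loader is not retained by this thread.
        thread.setContextClassLoader(contextClassLoader);
        return thread;
    }

    public static void main(String[] args) {
        ThreadFactory factory = new FixedContextClassLoaderThreadFactory(
                "Demo Netty Client", ClassLoader.getSystemClassLoader());
        Thread thread = factory.newThread(() -> { });
        System.out.println(thread.getName() + " -> " + thread.getContextClassLoader());
    }
}
```

Whether pinning the loader is safe depends on whether anything running on these threads needs to resolve user classes through the context class loader; that question is not settled in this thread.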



[jira] [Commented] (FLINK-33623) Metaspce leak caused by Flink Netty Client thread

2023-11-22 Thread gabrywu (Jira)


[ 
https://issues.apache.org/jira/browse/FLINK-33623?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel=17788959#comment-17788959
 ] 

gabrywu commented on FLINK-33623:
-

The Flink Netty Server is created successfully once the TaskManager starts, 
and no Flink Netty Client thread has been created at that point.

!image-2023-11-23-10-48-43-070.png|width=1305,height=168!

> Metaspce leak caused by Flink Netty Client thread
> -
>
> Key: FLINK-33623
> URL: https://issues.apache.org/jira/browse/FLINK-33623
> Project: Flink
>  Issue Type: Bug
>Affects Versions: 1.13.5
>Reporter: gabrywu
>Priority: Minor
> Attachments: image-2023-11-23-09-47-50-536.png, 
> image-2023-11-23-10-48-43-070.png
>
>
> Hi, folks, 
> We found that there is a Flink Netty Client thread with contextClassLoader 
> `ChildFirstClassLoader`, and it causes a metaspace leak.
> !image-2023-11-23-09-47-50-536.png|width=1175,height=651!


