Re: How to debug Metaspace exception?

2022-05-02 Thread John Smith
Ok, I don't think I'm running user code on the job manager. Basically, I'm running a standalone cluster: 3 ZooKeepers, 3 job managers, 3 task managers. I submit my jobs via the UI. But just in case, I'll copy the config over to the job managers. On Mon, May 2, 2022 at 11:00 AM Chesnay Schepler

Re: How to debug Metaspace exception?

2022-05-02 Thread Chesnay Schepler
There are cases where user-code is run on the JobManager. I'm not sure, though, whether that applies to the JDBC sources. On 02/05/2022 15:45, John Smith wrote: Why do the JDBC jars need to be on the job manager node though? On Mon, May 2, 2022 at 9:36 AM Chesnay Schepler wrote: yes.

Re: How to debug Metaspace exception?

2022-05-02 Thread John Smith
Why do the JDBC jars need to be on the job manager node though? On Mon, May 2, 2022 at 9:36 AM Chesnay Schepler wrote: > yes. > But if you can ensure that the driver isn't bundled by any user-jar you > can also skip the pattern configuration step. > > The pattern looks correct formatting-wise;

Re: How to debug Metaspace exception?

2022-05-02 Thread Chesnay Schepler
yes. But if you can ensure that the driver isn't bundled by any user-jar you can also skip the pattern configuration step. The pattern looks correct formatting-wise; you could try whether com.microsoft.sqlserver.jdbc. is enough to solve the issue. On 02/05/2022 14:41, John Smith wrote: Oh,

Re: How to debug Metaspace exception?

2022-05-02 Thread John Smith
Oh, so I should copy the jars to the lib folder and set classloader.parent-first-patterns.additional: "org.apache.ignite.;com.microsoft.sqlserver.jdbc." to both the task managers and job managers? Also is my pattern correct? "org.apache.ignite.;com.microsoft.sqlserver.jdbc." Just to be sure I'm
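On the question of whether the pattern is well-formed: the Flink documentation describes classloader.parent-first-patterns.additional as a semicolon-separated list of simple prefixes checked against the fully qualified class name. The sketch below models that matching under this assumption. The class ParentFirstPatternSketch and its methods are hypothetical helpers for illustration, not Flink's own code, and com.example.MyJob is a made-up class name.

```java
import java.util.Arrays;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical helper, not part of Flink: models prefix matching of
// semicolon-separated parent-first patterns.
class ParentFirstPatternSketch {

    // Split the config value on ';' and drop empty entries.
    static List<String> parsePatterns(String configValue) {
        return Arrays.stream(configValue.split(";"))
                .map(String::trim)
                .filter(p -> !p.isEmpty())
                .collect(Collectors.toList());
    }

    // A class counts as parent-first if its fully qualified name
    // starts with any of the configured prefixes.
    static boolean isParentFirst(String className, List<String> patterns) {
        return patterns.stream().anyMatch(className::startsWith);
    }

    public static void main(String[] args) {
        List<String> patterns =
                parsePatterns("org.apache.ignite.;com.microsoft.sqlserver.jdbc.");
        // true: matches the second prefix
        System.out.println(isParentFirst("com.microsoft.sqlserver.jdbc.SQLServerDriver", patterns));
        // true: matches the first prefix
        System.out.println(isParentFirst("org.apache.ignite.Ignition", patterns));
        // false: matches neither prefix
        System.out.println(isParentFirst("com.example.MyJob", patterns));
    }
}
```

The trailing dot in each prefix keeps the match confined to that package and its subpackages, so a differently named package that merely shares the same leading characters would not match.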

Re: How to debug Metaspace exception?

2022-05-02 Thread Chesnay Schepler
And you should make sure that it is set for both processes! On 02/05/2022 08:43, Chesnay Schepler wrote: The setting itself isn't taskmanager specific; it applies to both the job- and taskmanager process. On 02/05/2022 05:29, John Smith wrote: Also just to be sure this is a Task Manager

Re: How to debug Metaspace exception?

2022-05-02 Thread Chesnay Schepler
The setting itself isn't taskmanager specific; it applies to both the job- and taskmanager process. On 02/05/2022 05:29, John Smith wrote: Also just to be sure this is a Task Manager setting right? On Thu, Apr 28, 2022 at 11:13 AM John Smith wrote: I assume you will take action on

Re: How to debug Metaspace exception?

2022-05-01 Thread John Smith
Also just to be sure this is a Task Manager setting right? On Thu, Apr 28, 2022 at 11:13 AM John Smith wrote: > I assume you will take action on your side to track and fix the doc? :) > > On Thu, Apr 28, 2022 at 11:12 AM John Smith > wrote: > >> Ok so to summarize... >> >> - Build my job jar

Re: How to debug Metaspace exception?

2022-04-28 Thread John Smith
I assume you will take action on your side to track and fix the doc? :) On Thu, Apr 28, 2022 at 11:12 AM John Smith wrote: > Ok so to summarize... > > - Build my job jar and have the JDBC driver as a compile only > dependency and copy the JDBC driver to flink lib folder. > > Or > > - Build my

Re: How to debug Metaspace exception?

2022-04-28 Thread John Smith
Ok so to summarize... - Build my job jar and have the JDBC driver as a compile only dependency and copy the JDBC driver to flink lib folder. Or - Build my job jar and include JDBC driver in the shadow, plus copy the JDBC driver in the flink lib folder, plus make an entry in config for

Re: How to debug Metaspace exception?

2022-04-28 Thread Chesnay Schepler
I think what I meant was "either add it to /lib, or [if it is already in /lib but also bundled in the jar] add it to the parent-first patterns." On 28/04/2022 15:56, Chesnay Schepler wrote: Pretty sure, even though I seemingly documented it incorrectly :) On 28/04/2022 15:49, John Smith

Re: How to debug Metaspace exception?

2022-04-28 Thread Chesnay Schepler
Pretty sure, even though I seemingly documented it incorrectly :) On 28/04/2022 15:49, John Smith wrote: You sure? * /JDBC/: JDBC drivers leak references outside the user code classloader. To ensure that these classes are only loaded once you should either add the driver jars to

Re: How to debug Metaspace exception?

2022-04-28 Thread John Smith
You sure? - *JDBC*: JDBC drivers leak references outside the user code classloader. To ensure that these classes are only loaded once you should either add the driver jars to Flink’s lib/ folder, or add the driver classes to the list of parent-first loaded class via

Re: How to debug Metaspace exception?

2022-04-27 Thread Chesnay Schepler
You're misinterpreting the docs. The parent/child-first classloading controls where Flink looks for a class /first/, specifically whether we first load from /lib or the user-jar. It does not allow you to load something from the user-jar in the parent classloader. That's just not how it works.
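The distinction above can be illustrated with a small model. This is only a sketch of the lookup order as described in this message, not Flink's classloader implementation; the class LookupOrderSketch, the Source enum, and the resolve method are hypothetical names introduced for illustration.

```java
import java.util.Collections;
import java.util.Set;

// Hypothetical model of the lookup order described above; not Flink code.
class LookupOrderSketch {

    enum Source { LIB, USER_JAR, NOT_FOUND }

    // Returns where a class ends up being loaded from, given which
    // locations contain it and which location is consulted first.
    static Source resolve(String className, Set<String> inLib,
                          Set<String> inUserJar, boolean parentFirst) {
        boolean lib = inLib.contains(className);
        boolean user = inUserJar.contains(className);
        if (parentFirst) {
            if (lib) return Source.LIB;
            if (user) return Source.USER_JAR; // parent-first cannot move it into /lib
        } else {
            if (user) return Source.USER_JAR;
            if (lib) return Source.LIB;
        }
        return Source.NOT_FOUND;
    }

    public static void main(String[] args) {
        String driver = "com.microsoft.sqlserver.jdbc.SQLServerDriver";
        Set<String> has = Collections.singleton(driver);
        Set<String> none = Collections.emptySet();

        // In both /lib and the user-jar, parent-first: LIB
        System.out.println(resolve(driver, has, has, true));
        // Only in the user-jar, parent-first: still USER_JAR
        System.out.println(resolve(driver, none, has, true));
        // In both, child-first: USER_JAR
        System.out.println(resolve(driver, has, has, false));
    }
}
```

The second case is the one at issue in this thread: when the driver exists only in the user-jar, a parent-first pattern changes nothing, because there is nothing in /lib for the parent classloader to find.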

Re: How to debug Metaspace exception?

2022-04-26 Thread John Smith
Hi Chesnay, as per the docs... https://nightlies.apache.org/flink/flink-docs-master/docs/ops/debugging/debugging_classloading/ You can either put the jars in task manager lib folder or use classloader.parent-first-patterns.additional

Re: How to debug Metaspace exception?

2022-04-26 Thread John Smith
Ok so I should put the Apache Ignite and my Microsoft drivers in the lib folders of my task managers? And then in my job jar only include them as compile time dependencies? On Tue, Apr 26, 2022 at 10:42 AM Chesnay Schepler wrote: > JDBC drivers are well-known for leaking classloaders

Re: How to debug Metaspace exception?

2022-04-26 Thread Chesnay Schepler
JDBC drivers are well-known for leaking classloaders unfortunately. You have correctly identified your alternatives. You must put the jdbc driver into /lib instead. Setting only the parent-first pattern shouldn't affect anything. That is only relevant if something is both in /lib and the

Re: How to debug Metaspace exception?

2022-04-26 Thread John Smith
So I put classloader.parent-first-patterns.additional: "org.apache.ignite." in the task config and so far I don't think I'm getting "java.lang.OutOfMemoryError: Metaspace" any more. Or it's too early to tell. Though now, the task managers are shutting down due to some other failures. So maybe

Re: How to debug Metaspace exception?

2022-04-20 Thread John Smith
Or I can put in the config to treat org.apache.ignite. classes as first class? On Tue, Apr 19, 2022 at 10:18 PM John Smith wrote: > Ok, so I loaded the dump into Eclipse Mat and followed: > https://cwiki.apache.org/confluence/display/FLINK/Debugging+ClassLoader+leaks > > - On the Histogram, I

Re: How to debug Metaspace exception?

2022-04-19 Thread John Smith
Ok, so I loaded the dump into Eclipse MAT and followed: https://cwiki.apache.org/confluence/display/FLINK/Debugging+ClassLoader+leaks - On the Histogram, I got over 30 entries for: ChildFirstClassLoader - Then I clicked on one of them "Merge Shortest Path..." and picked "Exclude all

Re: How to debug Metaspace exception?

2022-04-19 Thread Yaroslav Tkachenko
Also https://shopify.engineering/optimizing-apache-flink-applications-tips might be helpful (has a section on profiling, as well as classloading). On Tue, Apr 19, 2022 at 4:35 AM Chesnay Schepler wrote: > We have a very rough "guide" in the wiki (it's just the specific steps I > took to debug

Re: How to debug Metaspace exception?

2022-04-19 Thread Chesnay Schepler
We have a very rough "guide" in the wiki (it's just the specific steps I took to debug another leak): https://cwiki.apache.org/confluence/display/FLINK/Debugging+ClassLoader+leaks On 19/04/2022 12:01, huweihua wrote: Hi, John Sorry for the late reply. You can use MAT[1] to analyze the dump

Re: How to debug Metaspace exception?

2022-04-19 Thread huweihua
Hi, John. Sorry for the late reply. You can use MAT[1] to analyze the dump file. Check whether there are too many loaded classes. [1] https://www.eclipse.org/mat/ > On Apr 18, 2022, at 9:55 PM, John Smith wrote: > > Hi, can anyone help with this? I never looked at a dump file before. > > On Thu, Apr 14, 2022
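As a lighter-weight complement to a full dump, the class counts suggested above can also be read from the standard JMX beans. The following is a minimal sketch, assuming a HotSpot JVM where the memory pool is named "Metaspace". The class MetaspaceSnapshotSketch is a hypothetical helper, and it only reports on the JVM it runs in, so it would have to be invoked inside the TaskManager process (for example from user code) to say anything about that process. It does not replace the MAT analysis.

```java
import java.lang.management.ClassLoadingMXBean;
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryPoolMXBean;
import java.lang.management.MemoryUsage;

// Hypothetical helper: summarizes class loading and Metaspace usage
// of the current JVM using the standard java.lang.management API.
class MetaspaceSnapshotSketch {

    static String snapshot() {
        ClassLoadingMXBean cl = ManagementFactory.getClassLoadingMXBean();
        StringBuilder sb = new StringBuilder();
        sb.append("loaded=").append(cl.getLoadedClassCount())
          .append(" totalLoaded=").append(cl.getTotalLoadedClassCount())
          .append(" unloaded=").append(cl.getUnloadedClassCount());

        for (MemoryPoolMXBean pool : ManagementFactory.getMemoryPoolMXBeans()) {
            // Assumption: the pool name contains "Metaspace" (HotSpot naming).
            if (pool.getName().contains("Metaspace")) {
                MemoryUsage usage = pool.getUsage();
                if (usage != null) {
                    sb.append(" metaspaceUsedBytes=").append(usage.getUsed());
                }
            }
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(snapshot());
    }
}
```

If the loaded count keeps climbing across job restarts while the unloaded count stays flat, that would be consistent with classloaders not being released, which is the situation the heap dump is meant to confirm.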

Re: How to debug Metaspace exception?

2022-04-18 Thread John Smith
Hi, can anyone help with this? I never looked at a dump file before. On Thu, Apr 14, 2022 at 11:59 AM John Smith wrote: > Hi, so I have a dump file. What do I look for? > > On Thu, Mar 31, 2022 at 3:28 PM John Smith wrote: > >> Ok so if there's a leak, if I manually stop the job and restart it

Re: How to debug Metaspace exception?

2022-04-14 Thread John Smith
Hi, so I have a dump file. What do I look for? On Thu, Mar 31, 2022 at 3:28 PM John Smith wrote: > Ok so if there's a leak, if I manually stop the job and restart it from > the UI multiple times, I won't see the issue because the classes > are unloaded correctly? > > > On Thu, Mar 31,

Re: How to debug Metaspace exception?

2022-03-31 Thread John Smith
Ok so if there's a leak, if I manually stop the job and restart it from the UI multiple times, I won't see the issue because the classes are unloaded correctly? On Thu, Mar 31, 2022 at 9:20 AM huweihua wrote: > > The difference is that manually canceling the job stops the JobMaster,

Re: How to debug Metaspace exception?

2022-03-31 Thread huweihua
The difference is that manually canceling the job stops the JobMaster, but automatic failover keeps the JobMaster running. But from the TaskManager's point of view, it doesn't make much difference. > On Mar 31, 2022, at 4:01 AM, John Smith wrote: > > Also if I manually cancel and restart the same job over and over is

Re: How to debug Metaspace exception?

2022-03-30 Thread John Smith
Also if I manually cancel and restart the same job over and over is it the same as if flink was restarting a job due to failure? I.e., when I click "Cancel Job" on the UI is the job completely unloaded vs when the job scheduler restarts a job for whatever reason? Like this I'll stop and

Re: How to debug Metaspace exception?

2022-03-30 Thread 胡伟华
> So if I run the same jobs in my dev env will I still be able to see the > similar dump? I think running the same job in dev should be reproducible, maybe you can have a try. > If not I would have to wait at a low volume time to do it on production. > Also if I recall the dump is as big as

Re: How to debug Metaspace exception?

2022-03-30 Thread John Smith
I have 3 task managers (see config below). There is a total of 10 jobs with 25 slots being used. The jobs are 100% ETL, i.e., they load JSON, transform it and push it to JDBC; only 1 job of the 10 is pushing to an Apache Ignite cluster. For jmap: I know that it will pause the task manager. So if I run

Re: How to debug Metaspace exception?

2022-03-30 Thread 胡伟华
Hi, John. Could you tell us your application scenario? Is it a flink session cluster with a lot of jobs? Maybe you can try to dump the memory with jmap and use tools such as MAT to analyze whether there are abnormal classes and classloaders. > On Mar 30, 2022, at 6:09 AM, John Smith wrote: > > Hi running