Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Yangze Guo
Great work! Congratulations to everyone involved! Best, Yangze Guo On Fri, Oct 27, 2023 at 10:23 AM Qingsheng Ren wrote: > > Congratulations and big THANK YOU to everyone helping with this release! > > Best, > Qingsheng > > On Fri, Oct 27, 2023 at 10:18 AM Benchao Li wr

Re: Query on class annotation like @Experimental

2023-05-03 Thread Yangze Guo
. They are stable across patch releases (1.17.0 and 1.17.1), but can be changed across minor releases (1.17.0 and 1.18.0). You can refer to [1] for more details. [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-196%3A+Source+API+stability+guarantees Best, Yangze Guo On Wed, May 3
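
A minimal sketch of the rule quoted above, assuming plain `major.minor.patch` version strings. The helper name `same_minor_line` is hypothetical and is not part of Flink:

```python
def same_minor_line(version_a: str, version_b: str) -> bool:
    """Return True when the two versions differ at most in the patch number."""
    major_a, minor_a, _patch_a = (int(part) for part in version_a.split("."))
    major_b, minor_b, _patch_b = (int(part) for part in version_b.split("."))
    return (major_a, minor_a) == (major_b, minor_b)


# Patch releases of one minor line: the API is expected to stay stable.
print(same_minor_line("1.17.0", "1.17.1"))
# Different minor releases: the API is allowed to change.
print(same_minor_line("1.17.0", "1.18.0"))
```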

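Re: Flink dynamic resource allocation

2021-11-14 Thread Yangze Guo
You can try Reactive mode. [1] Scaling operations still require external intervention. [1] https://nightlies.apache.org/flink/flink-docs-master/zh/docs/deployment/elastic_scaling/ Best, Yangze Guo On Mon, Nov 15, 2021 at 3:41 PM 疾鹰击皓月 <1764232...@qq.com.invalid> wrote: > > Hello: > > The data traffic of our Flink project differs enormously between day and night. If the parallelism is set low, it cannot cope with the daytime traffic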

Re: How to specify slot task sharing group for a task manager?

2021-11-12 Thread Yangze Guo
AFAIK, it is not under discussion now. Best, Yangze Guo On Fri, Nov 12, 2021 at 9:16 PM Morten Gunnar Bjørner Lindeberg wrote: > > Hi again > > Ok, thanks then I understand:) Do you know if there is a plan to support this > with elastic scaling in a later release? >

Re: How to specify slot task sharing group for a task manager?

2021-11-11 Thread Yangze Guo
want some of the TaskManagers to have GPU, you may set up a standalone cluster and then manually start a TaskManager with GPU and let it register to the cluster. Best, Yangze Guo On Thu, Nov 11, 2021 at 9:57 PM Chesnay Schepler wrote: > > The external resource documentation should c

Re: NoResourceAvailableException on taskmanager(s)

2021-11-04 Thread Yangze Guo
s how many slots your job needs. Best, Yangze Guo On Thu, Nov 4, 2021 at 5:58 PM Deniz Koçak wrote: > > Hi, > > We have been running our job on flink image > 1.13.2-stream1-scala_2.12-java11. It's a standalone deployment on a > Kubernetes cluster (EKS on AWS which uses EC2 nodes

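Re: flink 1.13.1: running a batch application via yarn-application to move 100 million rows from a MySQL source to Hive requires configuring 16G+ of TaskManager memory

2021-11-04 Thread Yangze Guo
What was the cause of the failure? Do you have an error stack trace and logs? Best, Yangze Guo On Thu, Nov 4, 2021 at 4:01 PM Asahi Lee <978466...@qq.com.invalid> wrote: > > hi! > Using flink sql, I transferred 100 million rows from mysql into a hive database, running in yarn-application mode. Even with 16G of memory configured, the execution failed!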

Re: [ANNOUNCE] Apache Flink 1.13.3 released

2021-10-22 Thread Yangze Guo
Thanks, Chesnay, Martijn, and everyone involved! Best, Yangze Guo On Fri, Oct 22, 2021 at 4:25 PM Yun Tang wrote: > > Thanks for Chesnay & Martijn and everyone who made this release happen. > > Best > Yun Tang > > From: JING ZHANG > Se

Re: High availability data clean up

2021-10-21 Thread Yangze Guo
/native_kubernetes/#stop-a-running-session-cluster Best, Yangze Guo On Thu, Oct 21, 2021 at 6:37 AM Weiqing Yang wrote: > > > Hi, > > Per the doc, `kubernetes.jobmanager.owner.reference` can be used to set up > the owners of the job manager Deployment. If the owner is deleted, then

Re: Programmatically configuring S3 settings

2021-10-17 Thread Yangze Guo
Hi, Pavel. From my understanding of the doc[1], you need to set it in flink-conf.yaml instead of your job. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#hadooppresto-s3-file-systems-plugins Best, Yangze Guo On Sat, Oct 16, 2021 at 5:46 AM Pa

Re: Yarn job not exit when flink job exit

2021-10-11 Thread Yangze Guo
d you like to take a look at [2]? [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/yarn/#per-job-cluster-mode [2] https://ci.apache.org/projects/flink/flink-docs-master/docs/dev/table/sqlclient/ Best, Yangze Guo On Tue, Oct 12, 2021 at 10:52 AM C

Re: [ANNOUNCE] Apache Flink 1.14.0 released

2021-09-29 Thread Yangze Guo
Thanks, Xintong, Joe, Dawid for the great work, thanks to everyone involved! Best, Yangze Guo On Thu, Sep 30, 2021 at 12:02 AM Rion Williams wrote: > > Great news all! Looking forward to it! > > > On Sep 29, 2021, at 10:43 AM, Theo Diefenthal > > wrote: > >

Re: Flink run different jars

2021-09-29 Thread Yangze Guo
/deployment/resource-providers/standalone/overview/#starting-a-standalone-cluster-session-mode Best, Yangze Guo On Wed, Sep 29, 2021 at 1:02 PM Qihua Yang wrote: > > Hi Yangze, > > Thanks a lot for your reply. References are very helpful! > Another quick question. Reference 1 can sta

Re: Flink run different jars

2021-09-28 Thread Yangze Guo
-providers/native_kubernetes/#session-mode [3] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/yarn/#session-mode Best, Yangze Guo On Wed, Sep 29, 2021 at 5:57 AM Qihua Yang wrote: > > Hi, > > Is that possible to run a flink app without a job? What

Re: S3 access permission error

2021-09-22 Thread Yangze Guo
I'm not an expert on S3. If it is not a credential issue, have you finished the checklist in this doc[1]? [1] https://aws.amazon.com/premiumsupport/knowledge-center/emr-s3-403-access-denied/?nc1=h_ls Best, Yangze Guo On Wed, Sep 22, 2021 at 3:39 PM Dhiru wrote: > > > Not sur

Re: S3 access permission error

2021-09-22 Thread Yangze Guo
You might need to configure the access credential. [1] [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/filesystems/s3/#configure-access-credentials Best, Yangze Guo On Wed, Sep 22, 2021 at 2:17 PM Dhiru wrote: > > > i see org.apache.hadoop.fs.FileSyst

Re: Invalid flink-config keeps going and ignores bad config values

2021-09-17 Thread Yangze Guo
AFAIK there is not. Flink will just skip the invalid lines. Best, Yangze Guo On Sat, Sep 18, 2021 at 7:00 AM Dan Hill wrote: > > Hi. I noticed my flink-config.yaml had an error in it. I assumed a bad > config would stop Flink from running (to catch errors earlier). Is there a >
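
To illustrate the behaviour described in this 2021 reply, here is a small sketch of a lenient `key: value` loader that drops malformed lines instead of failing. It is an illustration only, not Flink's actual parsing code; the function name `load_flat_config` and the sample values are assumptions:

```python
def load_flat_config(text: str) -> dict:
    """Parse flat 'key: value' lines, silently skipping anything malformed."""
    config = {}
    for line in text.splitlines():
        # Drop trailing comments and surrounding whitespace.
        content = line.split("#", 1)[0].strip()
        if not content:
            continue
        parts = content.split(": ", 1)
        if len(parts) != 2 or not parts[0].strip() or not parts[1].strip():
            # Malformed line: skipped rather than treated as a fatal error.
            continue
        config[parts[0].strip()] = parts[1].strip()
    return config


SAMPLE = """\
taskmanager.numberOfTaskSlots: 4
this line has no separator
parallelism.default:2
# a comment
heartbeat.timeout: 50000
"""

# Only the two well-formed entries survive; the bad lines vanish without an error.
print(load_flat_config(SAMPLE))
```

The consequence is the one raised in the thread: a typo in a key or a missing space after the colon goes unnoticed unless the logs are checked.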

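Re: flink on native kubernetes: large gap between a job's actual CPU usage and its CPU request

2021-09-16 Thread Yangze Guo
Too little CPU may be making the TM start slowly, so that it exceeds its registration time limit. You can try increasing resourcemanager.taskmanager-registration.timeout Best, Yangze Guo On Fri, Sep 17, 2021 at 9:36 AM casel.chen wrote: > > We run real-time Flink jobs on Kubernetes and found that the CPU actually used by a job is far below what the job requests, but after lowering the request the job fails to start. Is this an isolated case or normal behavior? > For example, one of our jobs requested 0.5 cpu but actually used only about 0.09; after changing the request to 0.2 c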

Re: Tracking Total Metrics Reported

2021-09-15 Thread Yangze Guo
Hi, Mason AFAIK the JM does not report the total number of metrics it has. Maybe you can collect the stats for each entity through [1]? [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/ops/metrics/#rest-api-integration Best, Yangze Guo On Thu, Sep 16, 2021 at 9:30 AM Mason Chen wrote

Re: Streaming SQL support for redis streaming connector

2021-09-14 Thread Yangze Guo
ps://github.com/apache/bahir-flink/blob/master/flink-connector-redis/src/main/java/org/apache/flink/streaming/connectors/redis/descriptor/RedisValidator.java Best, Yangze Guo On Wed, Sep 15, 2021 at 11:34 AM Osada Paranaliyanage wrote: > > Hi David, > > > > What abo

Re: Job manager crash

2021-09-05 Thread Yangze Guo
Job)? I would also pull in Yang Wang. Best, Yangze Guo On Mon, Sep 6, 2021 at 10:10 AM Caizhi Weng wrote: > > Hi! > > There is a message saying "java.lang.NoClassDefFoundError: > org/apache/hadoop/hdfs/HdfsConfiguration" in your log file. Are you visiting > H

Re: Flink taskmanager in crash loop

2021-08-17 Thread Yangze Guo
CANCELING" in jm and tm logs. Best, Yangze Guo On Wed, Aug 18, 2021 at 1:20 AM Abhishek Rai wrote: > > Before these message, there is the following message in the log: > > 2021-08-12 23:02:58.015 [Canceler/Interrupts for Source: MASKED]) > (1/1)#29103' did not react to can

Re: redis sink from flink

2021-08-16 Thread Yangze Guo
Hi, Jin IIUC, the DataStream connector `RedisSink` can still be used. However, the Table API connector `RedisTableSink` might not work (at least in the future) because it is implemented based on the deprecated Table connector abstraction. You can still give it a try, though. Best, Yangze Guo

Re: Flink taskmanager in crash loop

2021-08-16 Thread Yangze Guo
Hi, Abhishek, Do you see something like "Fatal error occurred while executing the TaskManager" in your log or would you like to provide the whole task manager log? Best, Yangze Guo On Tue, Aug 17, 2021 at 5:17 AM Abhishek Rai wrote: > > Hello, > > In our production envi

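Re: Question about the number of TaskManagers

2021-08-05 Thread Yangze Guo
On yarn, TaskManagers are now requested on demand. Best, Yangze Guo On Fri, Aug 6, 2021 at 10:31 AM 上官 wrote: > > In version 1.13, -yn no longer seems to work when submitting in yarn mode. How do I specify the number of containers (taskmanagers) now?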

Re: Obtain JobManager Web Interface URL

2021-08-02 Thread Yangze Guo
From my understanding, what you want is actually a management system for Flink jobs. I think it might be good to submit the job (with `flink run`) and retrieve the WebUI in another process. Best, Yangze Guo On Mon, Aug 2, 2021 at 10:39 PM Hailu, Andreas [Engineering] wrote: > > Hi Yan

Re: Obtain JobManager Web Interface URL

2021-08-01 Thread Yangze Guo
AFAIK, the ClusterClient should not be exposed through the public API. Would you like to explain your use case and why you need to get the web UI programmatically? Best, Yangze Guo On Fri, Jul 30, 2021 at 9:54 PM Hailu, Andreas [Engineering] wrote: > > Hello Yangze, thanks for resp

Re: Obtain JobManager Web Interface URL

2021-07-29 Thread Yangze Guo
/flink/flink-docs-master/docs/deployment/resource-providers/standalone/overview/#starting-and-stopping-a-cluster Best, Yangze Guo On Fri, Jul 30, 2021 at 1:41 AM Hailu, Andreas [Engineering] wrote: > > Hi team, > > > > Is there a method available to obtain the JobManager’s REST url

Re: TaskManager crash after cancelling a job

2021-07-28 Thread Yangze Guo
In your case, the entry point is the `cleanUpInvoke` function called by `StreamTask#invoke`. @ro...@apache.org Could you take another look at this? Best, Yangze Guo On Thu, Jul 29, 2021 at 2:29 AM Ivan Yang wrote: > > Hi Yangze, > > I deployed 1.13.1, same problem exists. I

Re: TaskManager crash after cancelling a job

2021-07-26 Thread Yangze Guo
Hi, Ivan My gut feeling is that it is related to FLINK-22535. Could @Yun Gao take another look? If that is the case, you can upgrade to 1.13.1. Best, Yangze Guo On Tue, Jul 27, 2021 at 9:41 AM Ivan Yang wrote: > > Dear Flink experts, > > We recently ran into an issue during a job

Re: kerberos token expire

2021-07-05 Thread Yangze Guo
The ticket cache expires at the end of its lifetime. Since you provide the keytab, you can try setting security.kerberos.login.use-ticket-cache to false. Best, Yangze Guo On Tue, Jul 6, 2021 at 10:02 AM 谢扬成 wrote: > > Hi, > > I processed data with flink which version is 1.12.2, data

Re: Yarn doesn't deploy multiple TMs; -yn option missing in newer versions

2021-07-05 Thread Yangze Guo
a replacement but the work has not been started yet. As a workaround for your testing purpose, you can submit a warmup job (e.g. WordCount with the required parallelism) and increase the "slotmanager.taskmanager-timeout" to ensure the TM does not time out too quickly. [1] http://issues.apache.

Re: specify number of TM; how stream app use state of batch app; orc / parquet file format have different impact on tpcds performance benchmark.

2021-07-01 Thread Yangze Guo
Please refer to taskmanager.numberOfTaskSlots [1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/config/#taskmanager-numberoftaskslots Best, Yangze Guo On Thu, Jul 1, 2021 at 3:57 PM vtygoss wrote: > > Hi, > > > i have some questio

Re: FW: Hadoop3 with Flink

2021-06-28 Thread Yangze Guo
Sorry for the belated reply. In 1.12, you just need to make sure that the HADOOP_CLASSPATH environment variable is set up. For more details, please refer to [1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/yarn/ Best, Yangze Guo On Mon, Jun 28

Re: How to set state.backend.rocksdb.latency-track-enabled

2021-06-18 Thread Yangze Guo
e-backends-latency-tracking-options Best, Yangze Guo On Fri, Jun 18, 2021 at 3:39 PM Chen-Che Huang wrote: > > Hi, > > The 1.13 release note > (https://flink.apache.org/news/2021/05/03/release-1.13.0.html) mentions that > we can set state.backend.rocksdb.latency-track-enabled to o

Re: after upgrade flink1.12 to flink1.13.1, flink web-ui's taskmanager detail page error

2021-06-17 Thread Yangze Guo
Thanks for the report, Yidan. It will be fixed in FLINK-23024, hopefully in 1.13.2. Best, Yangze Guo On Fri, Jun 18, 2021 at 10:00 AM yidan zhao wrote: > > Yeah, I also think it is a bug. > > Arvid Heise wrote on Thu, Jun 17, 2021 at 10:13 PM: > > > > Hi Yidan,

Re: Elasticsearch sink connector timeout

2021-06-06 Thread Yangze Guo
the connection/socket timeout in Elasticsearch SQL connector. However, if the root cause is a network jitter, you may increase the sink.bulk-flush.backoff.delay and the sink.bulk-flush.backoff.max-retries. Best, Yangze Guo On Sat, Jun 5, 2021 at 2:28 PM Kai Fu wrote: > > With some investi

Re: Flink app performance test framework

2021-06-06 Thread Yangze Guo
. [1] https://github.com/nexmark/nexmark Best, Yangze Guo On Sun, Jun 6, 2021 at 7:35 AM luck li wrote: > > Hi flink community, > > Is there any test framework that we can use to test flink jobs performance? > We would like to automate process for regression tests during flink ver

Re: [ANNOUNCE] Apache Flink 1.13.1 released

2021-05-31 Thread Yangze Guo
Thanks, Dawid for the great work, thanks to everyone involved. Best, Yangze Guo On Mon, May 31, 2021 at 4:14 PM Youngwoo Kim (김영우) wrote: > > Got it. > Thanks Dawid for the clarification. > > - Youngwoo > > On Mon, May 31, 2021 at 4:50 PM Dawid Wysakowicz >

Re: Heartbeat Timeout

2021-05-27 Thread Yangze Guo
Hi, Rober, To mitigate this issue, you can increase the "heartbeat.interval" and "heartbeat.timeout". However, I think we should first figure out the root cause, would you like to provide the log of 10.42.0.49:6122-e26293? Best, Yangze Guo On Thu, May 27, 2021 at 10:44 PM

Re: DataStream API in Batch Mode job is timing out, please advise on how to adjust the parameters.

2021-05-25 Thread Yangze Guo
Hi, Marco, The root cause is NoResourceAvailableException. Could you provide the following information? - How many slots does each TM have? - Your job's topology. It would also be good to share the job manager log. Best, Yangze Guo On Tue, May 25, 2021 at 12:10 PM Marco Villalobos wrote: > >

Re: ES sink never receive error code

2021-05-24 Thread Yangze Guo
Jacky is right. It's a known issue and will be fixed in FLINK-21511. Best, Yangze Guo On Tue, May 25, 2021 at 8:40 AM Jacky Yin 殷传旺 wrote: > > If you are using es connector 6.*, actually there is a deadlock bug if the > backoff is enabled. The 'retry' and 'flush' share one thread p

Re: Issues with forwarding environment variables

2021-05-20 Thread Yangze Guo
the system properties at the TaskManager will be used. If that is the case, you can initialize the `serviceName` field in the map/flatMap or open function. Then, it will read the TM's envs or properties instead. Best, Yangze Guo On Fri, May 21, 2021 at 5:40 AM Milind Vaidya wrote: > > This is jav

Re: ES sink never receive error code

2021-05-20 Thread Yangze Guo
ill still be called. Best, Yangze Guo On Fri, May 21, 2021 at 5:53 AM Qihua Yang wrote: > > Thank you for the reply! > Yes, we did config bulk.flush.backoff.enable. > So, ES BulkProcessor retried after bulk request was partially rejected. And > eventually that request was sent s

Re: Root Exception can not be shown on Web UI in Flink 1.13.0

2021-05-12 Thread Yangze Guo
Hi, it seems to be related to FLINK-22276. Thus, I'd involve Matthias to take a look. @Matthias My gut feeling is that not all executions that have failureInfo have been deployed? Best, Yangze Guo On Wed, May 12, 2021 at 10:12 PM Gary Wu wrote: > > Hi, > > We have upgraded our Flink

Re: Customized Metric Reporter can not be found by Flink

2021-05-11 Thread Yangze Guo
Hi, Fan Flink loads the custom reporter through the service loader mechanism.[1] Did you add the service file in the "resources/META-INF/services" directory? [1] https://docs.oracle.com/javase/9/docs/api/java/util/ServiceLoader.html Best, Yangze Guo On Wed, May 12, 2021 at 7:53

Re: Session mode on Kubernetes and # of TMs

2021-05-11 Thread Yangze Guo
59, we will introduce the min number of slots of the cluster. With this feature, you can configure how many TMs needed before submitting the jobs. [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/kubernetes/ Best, Yangze Guo On Tue, May 11, 2021 at 12

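Re: Question

2021-05-09 Thread Yangze Guo
Hi, according to the log the application has already been submitted to yarn, but the AM has not been scheduled. In the Yarn UI, what is the state of application_1620481460888_0003? Is there any error? Best, Yangze Guo On Mon, May 10, 2021 at 9:41 AM wyinj...@126.com wrote: > > Hello, I ran into a problem using flink on YARN and am looking for help. After setting up the YARN cluster, I ran the following flink command: > ./bin/flink run -t yarn-per-job -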

Re: How to increase the number of task managers?

2021-05-07 Thread Yangze Guo
/docs/deployment/resource-providers/yarn/ [3] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/resource-providers/standalone/overview/#starting-and-stopping-a-cluster Best, Yangze Guo On Fri, May 7, 2021 at 7:34 PM Tamir Sagi wrote: > > Hey > >

Re: Enabling Checkpointing using FsStatebackend

2021-05-07 Thread Yangze Guo
Hi, I think checkpointing is not the root cause of your job failure. As the log describes, your job failed because of a Kafka authorization issue. "Caused by: org.apache.kafka.common.errors.TransactionalIdAuthorizationException: Transactional Id authorization failed." Best,

Re: [ANNOUNCE] Apache Flink 1.13.0 released

2021-05-07 Thread Yangze Guo
Thanks, Dawid & Guowei for the great work, thanks to everyone involved. Best, Yangze Guo On Thu, May 6, 2021 at 5:51 PM Rui Li wrote: > > Thanks to Dawid and Guowei for the great work! > > On Thu, May 6, 2021 at 4:48 PM Zhu Zhu wrote: >> >> Thanks Dawid and

Re: Deployment/Memory Configuration/Scalability

2021-04-26 Thread Yangze Guo
ache.org/projects/flink/flink-docs-stable/ops/state/checkpoints.html [3] https://ci.apache.org/projects/flink/flink-docs-release-1.12/deployment/resource-providers/native_kubernetes.html [4] https://issues.apache.org/jira/browse/FLINK-17709 [5] https://ci.apache.org/projects/flink/flink-docs-relea

Re: when should `FlinkYarnSessionCli` be included for parsing CLI arguments?

2021-04-26 Thread Yangze Guo
If the GenericCLI is selected, then the execution.target should have been overwritten to "yarn-application" in GenericCLI#toConfiguration. It is odd that GenericCLI#isActive returns false, since the execution.target is defined in both flink-conf and the command line. Best, Yangze Guo O

Re: when should `FlinkYarnSessionCli` be included for parsing CLI arguments?

2021-04-26 Thread Yangze Guo
, Yangze Guo On Mon, Apr 26, 2021 at 4:48 PM Till Rohrmann wrote: > > Hi Tony, > > I think you are right that Flink's cli does not behave super consistent at > the moment. Case 2. should definitely work because `-t yarn-application` > should overwrite what is defined in the F

Re: Kubernetes Setup - JM as job vs JM as deployment

2021-04-26 Thread Yangze Guo
/kubernetes.html Best, Yangze Guo On Thu, Apr 22, 2021 at 10:46 PM Matthias Pohl wrote: > > Hi Gil, > I'm not sure whether I understand you correctly. What do you mean by > deploying the job manager as "job" or "deployment"? Are you referring to the > differen

Re: when should `FlinkYarnSessionCli` be included for parsing CLI arguments?

2021-04-26 Thread Yangze Guo
Hi, Tony. What is the version of your flink-dist? AFAIK, this issue should be addressed in FLINK-15852[1]. Could you share the client log of case 2 (setting the log level to DEBUG would be better)? [1] https://issues.apache.org/jira/browse/FLINK-15852 Best, Yangze Guo On Sun, Apr 25, 2021 at 11:33

Re: Receiving context information through JobListener interface

2021-04-25 Thread Yangze Guo
-docs-master/docs/ops/rest_api/#jobs-overview Best, Yangze Guo On Sun, Apr 25, 2021 at 4:17 PM Barak Ben Nathan wrote: > > > > Hi all, > > > > I am building an application that launches Flink Jobs and monitors them. > > > > I want to use the JobListener inter

Re: flink1.12.1: Invalid lambda deserialization when sinking data to ES7

2021-04-15 Thread Yangze Guo
You can refer to [1]. If it is the same problem, change the dependency to flink-connector-elasticsearch [1] https://issues.apache.org/jira/browse/FLINK-18857 Best, Yangze Guo On Fri, Apr 16, 2021 at 10:43 AM Yangze Guo wrote: > > Could you send the complete error stack trace or logs? > > Best, > Yangze Guo > > On Fri, Apr 16, 2021 at 9:33 AM william <

Re: flink1.12.1: Invalid lambda deserialization when sinking data to ES7

2021-04-15 Thread Yangze Guo
Could you send the complete error stack trace or logs? Best, Yangze Guo On Fri, Apr 16, 2021 at 9:33 AM william <712677...@qq.com> wrote: > > > > > > -- > Sent from: http://apache-flink.147419.n8.nabble.com/

Re: Does it support gpu coding in flink?

2021-04-12 Thread Yangze Guo
/src/main/java/com/alibaba/flink/ml/examples/tensorflow/mnist Best, Yangze Guo On Mon, Apr 12, 2021 at 2:35 PM 张颖 wrote: > > Hi, I am running a tf inference task on my cluster, but I find it took so long > a time to get response, because it is a bert model and I run it on cpu >

Re: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-08 Thread Yangze Guo
IIUC, your program will eventually create 100 ChildFirstClassLoader instances in a TM. But they should always be garbage-collected when the job finishes. So, as Arvid said, you'd better check what is referencing those ChildFirstClassLoader instances. Best, Yangze Guo On Thu, Apr 8, 2021 at 5:43 PM 太平洋 <495635...@qq.com> wrote:

Re: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-07 Thread Yangze Guo
I went through the JM & TM logs but could not find any valuable clue. The exception is actually thrown by kafka-producer-network-thread. Maybe @Qingsheng could also take a look? Best, Yangze Guo On Thu, Apr 8, 2021 at 10:39 AM 太平洋 <495635...@qq.com> wrote: > > I have co

Re: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-06 Thread Yangze Guo
me TM will share the same classloader. The classloader will be removed if there is no more task running on the TM. A classloader without references will eventually be cleaned up by GC. Could you share JM and TM logs for further analysis? I'll also involve @Guowei Ma in this thread. Best, Yangze Guo On Tue,

Re: period batch job lead to OutOfMemoryError: Metaspace problem

2021-04-06 Thread Yangze Guo
I think you can try to increase the JVM metaspace option for TaskManagers through taskmanager.memory.jvm-metaspace.size. [1] [1] https://ci.apache.org/projects/flink/flink-docs-master/docs/deployment/memory/mem_trouble/#outofmemoryerror-metaspace Best, Yangze Guo On Tue, Apr

Re: [BULK]Re: [SURVEY] Remove Mesos support

2021-03-28 Thread Yangze Guo
+1 Best, Yangze Guo On Mon, Mar 29, 2021 at 11:31 AM Xintong Song wrote: > > +1 > It's already a matter of fact for a while that we no longer port new features > to the Mesos deployment. > > Thank you~ > > Xintong Song > > > > On Fri, Mar 26, 2021 at 10:

Re: [ANNOUNCE] Apache Flink 1.12.1 released

2021-01-19 Thread Yangze Guo
Thanks Xintong for the great work! Best, Yangze Guo On Tue, Jan 19, 2021 at 4:47 PM Till Rohrmann wrote: > > Thanks a lot for driving this release Xintong. This was indeed a release with > some obstacles to overcome and you did it very well! > > Cheers, > Till > > On T

Re: Memory parameters specified via the CLI have no effect when submitting a job in yarn Per-Job Cluster Mode

2021-01-17 Thread Yangze Guo
Is this path a local path on your machine? The client needs to be able to find the jar at this path. Best, Yangze Guo On Mon, Jan 18, 2021 at 10:34 AM 刘海 wrote: > > Hello > I tried it as you suggested > and changed the submit command to: ./bin/flink run -d -t yarn-per-job -tm 1536 -jm 3072 -D > jobmanager.memory.process.size=1.5GB -D taskmanager.memory.process.size=3GB > -D

Re: Monitor the Flink

2021-01-17 Thread Yangze Guo
, Yangze Guo On Sun, Jan 17, 2021 at 6:43 PM penguin. wrote: > > Hello, > > > In the Flink cluster, > > How to monitor each taskslot of taskmanager? For example, the CPU and memory > usage of each slot and the traffic between slots. > > What is the way to get the tr

Re: Number of parallel connections for Elasticsearch Connector

2021-01-17 Thread Yangze Guo
sks will be placed into different slots and have their own Elasticsearch sink instance. [1] https://github.com/apache/flink/blob/master/flink-connectors/flink-connector-elasticsearch-base/src/main/java/org/apache/flink/streaming/connectors/elasticsearch/ElasticsearchSinkBase.java#L204. Best, Yangze

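Re: Memory parameters specified via the CLI have no effect when submitting a job in yarn Per-Job Cluster Mode

2021-01-17 Thread Yangze Guo
Hi, please use -D, -tm and -jm; the y prefix is not needed. Best, Yangze Guo On Mon, Jan 18, 2021 at 9:19 AM 刘海 wrote: > > > 刘海 > liuha...@163.com > Signature customized by NetEase Mail Master > On Jan 18, 2021 at 09:15, 刘海 wrote: > > Hi Dear All, > I have a question for everyone. Here is my cluster configuration: > 1. I am currently using flink 1.12; > 2. built on CDH6.3.2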

Re: Main class logs in Yarn Mode

2021-01-12 Thread Yangze Guo
I think you can try the application mode[1]. [1] https://ci.apache.org/projects/flink/flink-docs-master/deployment/#application-mode Best, Yangze Guo On Tue, Jan 12, 2021 at 5:23 PM bat man wrote: > > Thanks Yangze Gua. > Is there a way these can be redirected to a yarn logs. >

Re: Main class logs in Yarn Mode

2021-01-12 Thread Yangze Guo
The main function of your WordCountExample is executed in your local environment. So, the logs you are looking for ("Entering application.") are located in your console output and the "log/" directory of your Flink distribution. Best, Yangze Guo On Tue, Jan 12, 2021 at 4:

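Re: Question about task manager memory usage

2020-12-17 Thread Yangze Guo
1. To add JVM arguments, you can use the env.java.opts.taskmanager option. 2. Currently the TM has no fine-grained management of heap memory across slots; this is not supported in session mode. Best, Yangze Guo On Fri, Dec 18, 2020 at 9:22 AM guoliubi...@foxmail.com wrote: > > Hi, > I am using flink 1.12 in standalone cluster mode. > One job running on it consumes a lot of memory; it fills up the heap and then brings down the whole task manager. > I would like to ask how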

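Re: flink sql 1.12: deployment problem when writing data to elasticsearch

2020-12-15 Thread Yangze Guo
You need to put in flink-sql-connector-elasticsearch7_2.11-1.12.0.jar Best, Yangze Guo On Wed, Dec 16, 2020 at 11:34 AM cljb...@163.com wrote: > > hi, > With flink sql 1.12, writing data to elasticsearch works when run locally, but after deploying to the server it reports the following error. > I checked the packaged jar and it does contain the corresponding classes, and under flink > lib I have also already placed flink-connector-elasticsearch7_2.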

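Re: flink1.11 datastream elasticsearch sink: writing to ES requires username/password authentication; is this currently supported?

2020-12-15 Thread Yangze Guo
Setting username and password is not yet supported in 1.11; these two options were added to the new es connector in 1.12 [1] [1] https://issues.apache.org/jira/browse/FLINK-18361 Best, Yangze Guo On Wed, Dec 16, 2020 at 11:34 AM 李世钰 wrote: > > flink1.11 datastream elasticsearch sink: writing to ES requires username/password authentication; is this currently supported? > elastic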

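Re: Cannot write data read from Kafka into Redis using RedisSink

2020-12-06 Thread Yangze Guo
Most likely the network is unreachable; you can check the whitelist settings. Best, Yangze Guo On Mon, Dec 7, 2020 at 10:28 AM Jark Wu wrote: > > This is probably related to networking and deployment; I suggest consulting Huawei Cloud technical support. > > On Sun, 6 Dec 2020 at 20:40, 赵一旦 wrote: > > > It cannot connect. Have you confirmed that your Huawei Cloud instance can reach the redis server? > > > > 追梦的废柴 wrote on Sun, Dec 6, 2020 at 8:35 PM: > > > > > Everyone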

Re: taskmanager.cpu.cores 1.7976931348623157E308

2020-12-06 Thread Yangze Guo
My gut feeling is your "vmArgs" does not take effect. Best, Yangze Guo On Mon, Dec 7, 2020 at 10:32 AM Yangze Guo wrote: > > Hi, Rex, > > Can you share more logs for it. Did you see something like "The > configuration option taskmanager.cpu.cores required fo

Re: taskmanager.cpu.cores 1.7976931348623157E308

2020-12-06 Thread Yangze Guo
Hi, Rex, Can you share more logs for it? Did you see something like "The configuration option taskmanager.cpu.cores required for local execution is not set, setting it to" in your logs? Best, Yangze Guo On Sat, Dec 5, 2020 at 6:53 PM David Ander

Re: Flink on YARN: delegation token expired prevent job restart

2020-11-17 Thread Yangze Guo
Hi, There is a login operation in YarnEntrypointUtils.logYarnEnvironmentInformation without the keytab. One suspect is that Flink may access the HDFS when it tries to build the PackagedProgram. Does this issue only happen in the application mode? If so, I would cc @kkloudas. Best, Yangze Guo

Re: Flink on YARN: delegation token expired prevent job restart

2020-11-17 Thread Yangze Guo
Hi, AFAIK, Flink does exclude the HDFS_DELEGATION_TOKEN in the HadoopModule when the user provides the keytab and principal. I'll try to do a deeper investigation to figure out whether there is any HDFS access before the HadoopModule is installed. Best, Yangze Guo On Tue, Nov 17, 2020 at 4:36 PM Kien Truong

Re: Flink on YARN: delegation token expired prevent job restart

2020-11-17 Thread Yangze Guo
Hi, Kien, Did you configure the "security.kerberos.login.principal" and the "security.kerberos.login.keytab" together? If you only set the keytab, it will not take effect. Best, Yangze Guo On Tue, Nov 17, 2020 at 3:03 PM Kien Truong wrote: > > Hi all, > > We
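
The pairing requirement in this reply can be sketched as a small pre-flight check over a configuration dictionary. This is only an illustration: the helper name `keytab_login_configured`, the keytab path, and the principal are hypothetical, and only the two option keys come from the message above:

```python
PRINCIPAL_KEY = "security.kerberos.login.principal"
KEYTAB_KEY = "security.kerberos.login.keytab"


def keytab_login_configured(conf: dict) -> bool:
    """Return True only when both the principal and the keytab are set."""
    return bool(conf.get(PRINCIPAL_KEY)) and bool(conf.get(KEYTAB_KEY))


# Hypothetical values, for illustration only.
keytab_only = {KEYTAB_KEY: "/path/to/user.keytab"}
both_set = {
    KEYTAB_KEY: "/path/to/user.keytab",
    PRINCIPAL_KEY: "flink-user@EXAMPLE.COM",
}

print(keytab_login_configured(keytab_only))
print(keytab_login_configured(both_set))
```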

Re: Re: flink tm cpu cores setting

2020-11-10 Thread Yangze Guo
Yours is missing a "v"; it should be yarn.containers.vcores Best, Yangze Guo On Tue, Nov 10, 2020 at 3:43 PM zjfpla...@hotmail.com wrote: > > The JM logs contain > Loading configuration property: yarn.containers.cores,4 > > > > zjfpla...@hotmail.com > > From: zjfpla...@hotmail.com > Sent

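Re: Re: flink tm cpu cores setting

2020-11-08 Thread Yangze Guo
How did you confirm that it has no effect? Could you share the JM log? Also, whether this parameter actually takes effect depends on whether the yarn scheduler has cpu scheduling enabled. Best, Yangze Guo On Thu, Nov 5, 2020 at 1:50 PM zjfpla...@hotmail.com wrote: > > I set this in flink-conf.yaml and it had no effect > > > > zjfpla...@hotmail.com > > From: JasonLee > Sent: 2020-11-05 13:49 > To: user-zh > Subject: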

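Re: flink-1.11.2 job submitted to yarn stays in CREATED state

2020-11-03 Thread Yangze Guo
Do you have a more complete AM log? We need to look at the resource requests on the RM side. Best, Yangze Guo On Wed, Nov 4, 2020 at 11:45 AM 酷酷的浑蛋 wrote: > > > > Below is the error. It says there are no resources, but resources are sufficient. After I changed the version to 1.11.1, the job could run > org.apache.flink.runtime.jobmanager.scheduler.NoResourceAvailableException: > Could not allocate the required slot within sl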

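Re: Kerberos problem in yarn deployment mode

2020-11-03 Thread Yangze Guo
apache.org/projects/flink/flink-docs-master/ops/security-kerberos.html Best, Yangze Guo On Tue, Nov 3, 2020 at 4:17 PM amen...@163.com wrote: > > hi everyone, > > Recently, when submitting jobs to a yarn queue in per-job mode with flink-1.11.1, I ran into a kerberos authentication problem. > > Details: on the client, a specific user is authenticated via Kerberos and a flink job is submitted to the yarn queue; the submission succeeds, but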

Re: Increase in parallelism has very bad impact on performance

2020-11-02 Thread Yangze Guo
Hi, Sidney, What is the data generation rate of your Kafka topic? Is it a lot bigger than 6000? Best, Yangze Guo On Tue, Nov 3, 2020 at 8:45 AM Sidney Feiner wrote: > > Hey, > I'm writing a Flink app that does some transformation on an event consumed >

Re: flink1.11 elasticsearch connector

2020-10-29 Thread Yangze Guo
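Setting username and password is not yet supported in 1.11; these two options were added to the new es connector in 1.12 [1] [1] https://issues.apache.org/jira/browse/FLINK-18361 Best, Yangze Guo On Thu, Oct 29, 2020 at 3:37 PM 赵帅 wrote: > > elasticsearch7.6 has account authentication. How can account authentication currently be added with the flink1.11 elasticsearch connector sql api?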

Re: No pooled slot available and request to ResourceManager for new slot failed

2020-10-28 Thread Yangze Guo
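Hi, what is the parallelism of your job? How many slots were requested in total? If convenient, please also send the JM log to help troubleshoot. Best, Yangze Guo On Thu, Oct 29, 2020 at 10:07 AM marble.zh...@coinflex.com.INVALID wrote: > > Hello everyone. > > There is only one job; the total memory of the jm and tm is set to 3G each, with one taskmanager and 10 slots in total. Why is this error still reported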

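Re: pyflink1.11.0: if the elasticsearch host has access control, how do I supply the username and password to the connector

2020-10-22 Thread Yangze Guo
Setting username and password is not yet supported in 1.11; these two options were added to the new es connector in 1.12 [1] [1] https://issues.apache.org/jira/browse/FLINK-18361 Best, Yangze Guo On Thu, Oct 22, 2020 at 3:47 PM whh_960101 wrote: > > Hi everyone: if I want to sink data to an elasticsearch host that has access control, how do I supply the username and password to the elasticsearch > connector? I wrote it following the sample format on the official site, but could not find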

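Re: flink slot resource isolation

2020-10-20 Thread Yangze Guo
Managed Memory is isolated; Heap and Network are both shared at the TM level. Best, Yangze Guo On Wed, Oct 21, 2020 at 10:06 AM 赵一旦 wrote: > > Regarding flink slot resource isolation: is memory really isolated? CPU certainly is not.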
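
As a rough worked example of the managed-memory isolation mentioned in this reply, assuming a TaskManager's managed memory is divided evenly among its slots (the helper name and the numbers are hypothetical):

```python
def managed_memory_per_slot_mb(total_managed_mb: float, num_slots: int) -> float:
    """Even per-slot share of a TaskManager's managed memory, in MB."""
    if num_slots <= 0:
        raise ValueError("num_slots must be positive")
    return total_managed_mb / num_slots


# Hypothetical TaskManager: 2048 MB of managed memory and 4 slots.
share = managed_memory_per_slot_mb(2048, 4)
print(f"{share} MB of managed memory per slot")
```

Heap and network memory have no such per-slot budget, so one memory-hungry task can still affect the other slots of the same TaskManager.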

Re: Flink multiple task managers setup

2020-09-21 Thread Yangze Guo
Hi, As the error message says, it could not find the flink-dist.jar in "/cygdrive/d/Apacheflink/dist/apache-flink-1.9.3/deps/lib". Where is your flink distribution, and did you change its directory structure? Best, Yangze Guo On Mon, Sep 21, 2020 at 5:31 PM saksham sapra wro

Re: Flink multiple task managers setup

2020-09-17 Thread Yangze Guo
Sorry that the community decided to not maintain it anymore, you could take a look at [1]. [1] https://lists.apache.org/thread.html/r7693d0c06ac5ced9a34597c662bcf37b34ef8e799c32cc0edee373b2%40%3Cdev.flink.apache.org%3E Best, Yangze Guo On Thu, Sep 17, 2020 at 5:21 PM saksham sapra wrote

Re: Flink multiple task managers setup

2020-09-17 Thread Yangze Guo
, Yangze Guo On Thu, Sep 17, 2020 at 4:53 PM saksham sapra wrote: > > HI Yangze, > > I tried to run start-cluster.sh and i can see in host , when flink tries to > run second task manager or executor, pop up or host gets closed. > Please find attached logs for two command

Re: Flink multiple task managers setup

2020-09-17 Thread Yangze Guo
n to figure out what happened. - What is the output when you execute ./bin/start-cluster.sh, could you see two "Starting taskexecutor daemon on host" lines? - Could you see two flink-xxx-taskexecutor-xxx.log in $FLINK_DIST/log? If so, could you share these two log files? Best, Yangze Guo

Re: Flink multiple task managers setup

2020-09-17 Thread Yangze Guo
ost' to it. Then, execute the $FLINK_DIST/bin/start-cluster.sh, you could see a standalone cluster with two TM in your local machine. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.11/ops/deployment/cluster_setup.html#configuring-flink Best, Yangze Guo On Thu, Sep 17, 2020 at 3:16

Re: Use of slot sharing groups causing workflow to hang

2020-09-09 Thread Yangze Guo
our job indeed requests. And probably help to figure out what the ExecutionGraph finally looks like. Best, Yangze Guo On Thu, Sep 10, 2020 at 10:47 AM Ken Krugler wrote: > > Hi Til, > > On Sep 3, 2020, at 12:31 AM, Till Rohrmann wrote: > > Hi Ken, > > I believe that we don

Re: Difficulties with Minio state storage

2020-09-08 Thread Yangze Guo
by the following reasons: - The MinIO is not well configured. - Maybe you need to create a bucket for it first. In my case, I create a bucket called "flink" first. Best, Yangze Guo On Wed, Sep 9, 2020 at 9:33 AM Rex Fenley wrote: > > Hello! > > I'm trying to test out

Re: Use of slot sharing groups causing workflow to hang

2020-09-02 Thread Yangze Guo
in it. Best, Yangze Guo On Thu, Sep 3, 2020 at 4:39 AM Ken Krugler wrote: > > Hi all, > > I’ve got a streaming workflow (using Flink 1.11.1) that runs fine locally > (via Eclipse), with a parallelism of either 3 or 6. > > If I set up part of the workflow to use a specific
