Re: Flink batch app occasionally hang

2019-10-29 Thread vino yang
Hi Caio, Because it involves interaction with external systems. It would be better if you can provide the full logs. Best, Vino Caio Aoque 于2019年10月30日周三 上午8:31写道: > Hi, I've been running some flink scala applications on an AWS EMR cluster > (version 5.26.0 with flink 1.8.0 for scala 2.11) for

Re: Flink checkpointing behavior

2019-10-29 Thread vino yang
Hi Amran, See my inline answers. Best, Vino amran dean 于2019年10月30日周三 上午2:59写道: > Hello, > Exact semantics for checkpointing/task recovery are still a little > confusing to me after parsing docs: so a few questions. > > - What does Flink consider a task failure? Is it any exception that the >

Re: Add custom fields into Json

2019-10-29 Thread vino yang
Hi, The exception shows your SQL statement has grammatical errors. Please check it again or provide the whole SQL statement here. Best, Vino srikanth flink 于2019年10月29日周二 下午2:51写道: > Hi there, > > I'm querying json data and is working fine. I would like to add custom > fields including the que

Re: Testing AggregateFunction() and ProcessWindowFunction() on KeyedDataStream

2019-10-28 Thread vino yang
erWithClientResource in order to use > StreamExecutionEnvironment? > > > > > > Best, > > Michael > > > > > > *From: *vino yang > *Date: *Monday, October 28, 2019 at 1:32 AM > *To: *Michael Nguyen > *Cc: *"user@flink.apache.org" > *Subjec

Re: Testing AggregateFunction() and ProcessWindowFunction() on KeyedDataStream

2019-10-28 Thread vino yang
Hi Michael, You may need to know `KeyedOneInputStreamOperatorTestHarness` test class. You can consider `WindowTranslationTest#testAggregateWithWindowFunctionEventTime` or `WindowTranslationTest#testAggregateWithWindowFunctionProcessingTime`[1](both of them call `processElementAndEnsureOutput`) as

Re: Cannot modify parallelism (rescale job) more than once

2019-10-28 Thread vino yang
Hi Pankaj, It seems it is a bug. You can report it by opening a Jira issue. Best, Vino Pankaj Chand 于2019年10月28日周一 上午10:51写道: > Hello, > > I am trying to modify the parallelism of a streaming Flink job (wiki-edits > example) multiple times on a standalone cluster (one local machine) having > t

Re: Can flink 1.9.1 use flink-shaded 2.8.3-1.8.2

2019-10-25 Thread vino yang
Hi Jeff, Maybe @Chesnay Schepler could tell you the answer. Best, Vino Jeff Zhang 于2019年10月25日周五 下午3:54写道: > Hi all, > > There's no new flink shaded release for flink 1.9, so I'd like to confirm > with you can flink 1.9.1 use flink-shaded 2.8.3-1.8.2 ? Or the flink shaded > is not necessary f

Re: How to create an empty test stream

2019-10-25 Thread vino yang
erything from the stream. > > -- Dmitry > > On Thu, Oct 24, 2019 at 2:29 AM vino yang wrote: > >> Hi Dmitry, >> >> Perhaps an easy way is to customize a source function. Then in the run >> method, start an empty loop? But I don't understand the meaning of starti

Re: How to create an empty test stream

2019-10-24 Thread vino yang
Hi Dmitry, Perhaps an easy way is to customize a source function. Then in the run method, start an empty loop? But I don't understand the meaning of starting a stream pipeline without generating data. Best, Vino Dmitry Minaev 于2019年10月24日周四 上午6:16写道: > Hi everyone, > > I have a pipeline where

Re: Docker flink 1.9.1

2019-10-24 Thread vino yang
OK, sounds good to me. Farouk 于2019年10月24日周四 下午3:23写道: > Hi > > I checked there is 2 PR on github for Flink 1.9.1. > > It should come soon enough :) > > Thanks > > Farouk > > Le mer. 23 oct. 2019 à 17:46, vino yang a écrit : > >> Hi Farouk, >>

Re: How many events can Flink process each second

2019-10-23 Thread vino yang
Hi A.V. Add a few more points to the previous two answers. There is no clear answer to this question. In addition to resource issues, it depends on the size of the messages you are dealing with and the complexity of the logic. If you don't consider a lot of extra factors, look at the performance o

Re: Docker flink 1.9.1

2019-10-23 Thread vino yang
Hi Farouk, Not long after Flink 1.9.1 was released, the community may not have time to provide the corresponding Dockerfiles. I can give you some information: Flink's official docker file is maintained in this repository. [1] I have seen many versions of docker files contributed by patricklucas[

Re: multi-tenancy without a kafka partition per tenant

2019-10-23 Thread vino yang
Hi Constantinos, I think your analysis is correct, if you have a multi-tenant scenario, but there is no distinction in Kafka. Then Flink can't treat different tenants differently. It is easy to form a data hotspot problem for the difference in the data volume of different tenants. A compromise is

Re: Comparing Storm and Flink resource requirements

2019-10-22 Thread vino yang
Hi Gyula, Based on our previous experience switching from Storm to Flink. For the same business, resources of the same size are completely sufficient, and the performance indicators are slightly better than Storm. As you said, this may be related to using some of Flink's special features like stat

Re: [ANNOUNCE] Kinesis connector becomes part of Flink releases

2019-09-03 Thread vino yang
Good news! Thanks for your efforts, Bowen! Best, Vino Yu Li 于2019年9月2日周一 上午6:04写道: > Great to know, thanks for the efforts Bowen! > > And I believe it worth a release note in the original JIRA, wdyt? Thanks. > > Best Regards, > Yu > > > On Sat, 31 Aug 2019 at 11:01, Bowen Li wrote: > >> Hi all

Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread vino yang
Congratulations! highfei2...@126.com 于2019年8月7日周三 下午7:09写道: > Congrats Hequn! > > Best, > Jeff Yang > > > Original Message > Subject: Re: [ANNOUNCE] Hequn becomes a Flink committer > From: Piotr Nowojski > To: JingsongLee > CC: Biao Liu ,Zhu Zhu ,Zili Chen ,Jeff Zhang ,Paul Lam

Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread vino yang
congratulations Rong Rong! Fabian Hueske 于2019年7月11日周四 下午10:25写道: > Hi everyone, > > I'm very happy to announce that Rong Rong accepted the offer of the Flink > PMC to become a committer of the Flink project. > > Rong has been contributing to Flink for many years, mainly working on SQL > and Yar

[DISCUSS] Make window state queryable

2019-07-03 Thread vino yang
Hi folks, Currently, the queryable state is not widely used in production. IMO, there are two key reasons caused this result. 1) the client of the queryable state is hard to use. Because it requires users to know the address of TaskManager and the port of the proxy. Actually, most business users w

Re: RE: [DISCUSS] Improve Queryable State and introduce a QueryServerProxy component

2019-07-02 Thread vino yang
; It will be very useful if someone from the committers joins the topic and > give us some insights what’s going to happen with that feature. > > > > > > Kind Regards, > > Georgi > > > > > > > > *From:* vino yang > *Sent:* Thursday, April 25, 2

Re: [DISCUSS] Deprecate previous Python APIs

2019-06-12 Thread vino yang
+1 from my side Best, Vino Terry Wang 于2019年6月12日周三 下午5:45写道: > +1 for deprecation. It’s very reasonable. > > 在 2019年6月12日,下午5:32,Till Rohrmann 写道: > > +1 for deprecation. > > Cheers, > Till > > On Wed, Jun 12, 2019 at 4:31 AM Hequn Cheng wrote: > >> +1 on the proposal! >> Maintaining only on

Re: [Discuss] Add JobListener (hook) in flink job lifecycle

2019-04-17 Thread vino yang
Hi Jeff, I personally like this proposal. From the perspective of programmability, the JobListener can make the third program more appreciable. The scene where I need the listener is the Flink cube engine for Apache Kylin. In the case, the Flink job program is embedded into the Kylin's executable

Re: [ANNOUNCE] Apache Flink 1.8.0 released

2019-04-10 Thread vino yang
Great news! Thanks Aljoscha for being the release manager and thanks to all the contributors! Best, Vino Driesprong, Fokko 于2019年4月10日周三 下午4:54写道: > Great news! Great effort by the community to make this happen. Thanks all! > > Cheers, Fokko > > Op wo 10 apr. 2019 om 10:50 schreef Shaoxuan Wan

Re: [DISCUSS] Drop Elasticssearch 1 connector

2019-04-09 Thread vino yang
+1 to drop it. Stephan Ewen 于2019年4月6日周六 上午6:51写道: > +1 to drop it > > Previously released versions are still available and compatible with newer > Flink versions anyways. > > On Fri, Apr 5, 2019 at 2:12 PM Bowen Li wrote: > > > +1 for dropping elasticsearch 1 connector. > > > > On Wed, Apr 3,

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread vino yang
Hi Becket, Great idea! +1 for this proposal. Best, Vino Bowen Li 于2019年3月6日周三 上午6:24写道: > Thanks for bring it up, Becket. That sounds very good to me. Spark also > has such a page for ecosystem project > https://spark.apache.org/third-party-projects.html and a hosted website > https://spark-pa

Re: [DISCUSS] Adding a mid-term roadmap to the Flink website

2019-02-26 Thread vino yang
Great job. Stephan! Best, Vino Jamie Grier 于2019年2月27日周三 上午2:27写道: > This is awesome, Stephan! Thanks for doing this. > > -Jamie > > > On Tue, Feb 26, 2019 at 9:29 AM Stephan Ewen wrote: > >> Here is the pull request with a draft of the roadmap: >> https://github.com/apache/flink-web/pull/178

Re: [ANNOUNCE] Apache Flink 1.7.2 released

2019-02-17 Thread vino yang
Great job! Thanks for being the release manager @Gordon. Best, Vino Hequn Cheng 于2019年2月18日周一 下午2:16写道: > Thanks a lot for the great release @Gordon. > Also thanks for the work by the whole community. :-) > > Best, Hequn > > > On Mon, Feb 18, 2019 at 2:12 PM jincheng sun > wrote: > > > Thanks

Re: [ANNOUNCE] Apache Flink 1.5.6 released

2018-12-26 Thread vino yang
Thomas, thanks for being a release manager. And Thanks for the whole community. I think the release of Flink 1.5.6 makes sense for many users who are currently unable to upgrade major versions. Best, Vino jincheng sun 于2018年12月27日周四 上午8:00写道: > Thanks a lot for being our release manager Thomas.

Re: [ANNOUNCE] Apache Flink 1.6.3 released

2018-12-23 Thread vino yang
Thanks for being the release manager Gordon. And glad to see Flink 1.6.3 released. Best, Vino Tzu-Li (Gordon) Tai 于2018年12月23日周日 下午9:35写道: > Hi, > > The Apache Flink community is very happy to announce the release of > Apache Flink 1.6.3, which is the third bugfix release for the Apache > Flink

Re: Connection time out - DataStream API Tutorial

2018-12-17 Thread vino yang
Hi Brett, Is your exception generated on your local machine or generated on a remote node? Best, Vino Brett Marcott 于2018年12月15日周六 下午5:38写道: > Hi Flink users, > > I am attempting to follow the tutorial here: > > https://ci.apache.org/projects/flink/flink-docs-release-1.7/tutorials/datastream_a

Re: Is there an example of flink cluster "as a job" deployment on k8s ?

2018-12-05 Thread vino yang
Hi Vishal, I found an example from Github for reference only. [1] Https://github.com/sanderploegsma/flink-k8s Thanks, vino. Vishal Santoshi 于2018年12月6日周四 上午5:50写道: > >

Re: CKAN inputFormat (batch)

2018-12-03 Thread vino yang
Hi Flavio, I can not open the first link[1] you provided. And what is your purpose? Introduce your CKAN input format to the community? Thanks, vino. [1]: https://ckan.org/about/instances/ Flavio Pompermaier 于2018年12月4日周二 上午1:09写道: > Hi to all, > we've just published an example of a simple CKA

Re: Flink stream with RabbitMQ source: Set max "request" message amount

2018-11-21 Thread vino yang
Hi Marke, AFAIK, you can set *basic.qos* to limit the consumption rate, please read this answer.[1] I am not sure if Flink RabbitMQ connector lets you set this property. You can check it. Thanks, vino. [1]: https://stackoverflow.com/questions/19163021/rabbitmq-how-to-throttle-the-consumer/191638

Re: Exception occurred while processing valve output watermark & NullPointerException

2018-11-20 Thread vino yang
Hi Steve, It seems the NPE caused by the property of the IoTEvent's instance. Can you make sure the property is not null? Thanks, vino. Steve Bistline 于2018年11月21日周三 上午2:09写道: > Any guidance would be most appreciated. > > Thx > > Steve > === > > java.lan

Re: Tentative release date for 1.6.3

2018-11-20 Thread vino yang
Hi Shailesh, Flink 1.7 is about to be released, and many of the problems encountered since Flink 1.6.2 have been fixed in version 1.7. So, I suggest you look forward and upgrade to Flink 1.7. Thanks, vino. Stefan Richter 于2018年11月20日周二 下午10:15写道: > Hi, > > there is no release date for 1.6.3, y

Re: How to use multiple sources with multiple sinks

2018-11-12 Thread vino yang
Hi, If you are expressing a job that contains three pairs of source->sinks that are isolated from each other, then Flink supports this form of Job. It is not much different from a single source->sink, just changed from a DataStream to three DataStreams. For example, *DataStream ds1 = xxx* *ds1.a

Re: Any examples on invoke the Flink REST API post method ?

2018-11-11 Thread vino yang
Hi Henry, Maybe Gary can help you, ping him for you. Thanks, vino. 徐涛 于2018年11月12日周一 下午12:45写道: > HI Experts, > I am trying to trigger a savepoint from Flink REST API on version 1.6 , in > the document it shows that I need to pass a json as a request body > { > "type" : "object”, > "id" : >

Re: how get job id which job run slot

2018-11-08 Thread vino yang
Hi lining, Yes, currently you can't get slot information via the "/taskmanagers/:taskmanagerid" rest API. In addition, please ask questions in the user mailing list. The dev mailing list mainly discusses information related to Flink development. Thanks, vino. lining jing 于2018年11月9日周五 上午5:42写道

Re: Flink Task Allocation on Yarn

2018-11-06 Thread vino yang
way to make better use of the resources of the yarn > cluster, try to allocate tasks to containers on different nodes. > > Thanks, Marvin. > > vino yang 于2018年10月29日周一 下午3:04写道: > >> Hi Marvin, >> >> YARN is a resource management and scheduling framework. &

Re: Split one dataset into multiple

2018-11-05 Thread vino yang
Hi madan, I think you need to hash partition your records. Flink supports hash partitioning of data. The operator is keyBy. If the value of your tag field is enumerable, you can also use split/select to achieve your purpose. Thanks, vino. madan 于2018年11月5日周一 下午6:37写道: > Hi, > > I have a custom

Re: Flink 1.6 job cluster mode, job_id is always 00000000000000000000000000000000

2018-11-04 Thread vino yang
Hi Hao Sun, When you use the Job Cluster mode, you should be sure to isolate the Zookeeper path for different jobs. Ufuk is correct. We fixed the JobID for the purpose of finding JobGraph in failover. In fact, FLINK-10291 should be combined with FLINK-10292[1]. To till, I hope FLINK-10292 can be

Re: Flink fails continuously: Couldn't retrieve the JobExecutionResult from the JobManager.

2018-10-31 Thread vino yang
gt; Any other proposals ? > > Thanks, marke. > > Am Mo., 22. Okt. 2018 um 04:04 Uhr schrieb vino yang < > yanghua1...@gmail.com>: > >> Hi Marke, >> >> Are you expecting your job to quickly return the results of the stream >> calculation? >> If it is ru

Re: Job fails to restore from checkpoint in Kubernetes with FileNotFoundException

2018-10-30 Thread vino yang
Hi John, Is the file system configured by RocksDBStateBackend HDFS?[1] Thanks, vino. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/state/state_backends.html#the-rocksdbstatebackend John Stone 于2018年10月30日周二 上午2:54写道: > I am testing Flink in a Kubernetes cluster and am f

Re: How to tune Hadoop version in flink shaded jar to Hadoop version actually used?

2018-10-29 Thread vino yang
, how can I adjust the Hadoop version with the Hadoop > version really used? > Thanks a lot!! > > Best > Henry > > 在 2018年10月26日,上午10:02,vino yang 写道: > > Hi Henry, > > When running flink on YARN, from ClusterEntrypoint the system environment > info is print out. &

Re: Flink Task Allocation on Yarn

2018-10-29 Thread vino yang
Hi Marvin, YARN is a resource management and scheduling framework. When you run Flink on YARN, Flink will hand over the container's scheduling tasks to YARN. This is also the reason why YARN is used. If you want to control the start and stop of TM, then I recommend you use standalone mode and set

Re: How to tune Hadoop version in flink shaded jar to Hadoop version actually used?

2018-10-25 Thread vino yang
Hi Henry, When running flink on YARN, from ClusterEntrypoint the system environment info is print out. One of the info is "Hadoop version: 2.4.1”, I think it is from the flink-shaded-hadoop2 jar. But actually the system Hadoop version is 2.7.2. I want to know is it OK if the version is different?

Re: Task manager count goes the expand then converge process when running flink on YARN

2018-10-24 Thread vino yang
Hi Henry, The phenomenon you expressed is there, this is a bug, but I can't remember its JIRA number. Thanks, vino. 徐涛 于2018年10月24日周三 下午11:27写道: > Hi experts > I am running flink job on YARN in job cluster mode, the job is divided > into 2 tasks, the following are some configs of the job: > pa

Re: Get nothing from TaskManager in UI

2018-10-23 Thread vino yang
Yours > Joshua > > On Tue, Oct 23, 2018 at 7:26 PM vino yang wrote: > >> Hi Joshua, >> >> Which version of Flink are you using? >> >> Thanks, vino. >> >> Joshua Fan 于2018年10月23日周二 下午5:58写道: >> >>> Hi All >>> >>>

Re: Get nothing from TaskManager in UI

2018-10-23 Thread vino yang
Hi Joshua, Which version of Flink are you using? Thanks, vino. Joshua Fan 于2018年10月23日周二 下午5:58写道: > Hi All > > came into new situations, that the UI can show metric data but the data > remains the same all the time after days. So, there are two cases, one is > no data in UI at all, another is

Re: Manual savepoint trigger

2018-10-21 Thread vino yang
Hi Anil, When we trigger a savepoint, the JobManager's CheckpointCoordinator will send an akka message for triggering to all source tasks, which will generate a barrier for the savepoint (checkpoint). I don't know if this explanation is clear enough. Thanks, vino. Anil 于2018年10月22日周一 下午2:21写道:

Re: Manual savepoint trigger

2018-10-21 Thread vino yang
Hi Anil, When you manually trigger a savepoint, it is clear that it will wait for the savepoint to complete. Of course, the behavior of savepoint is consistent with checkpoint. Thanks, vino. Anil 于2018年10月22日周一 下午1:16写道: > A checkpoint is completed when the nth the checkpoint barrier crosses t

Re: Flink fails continuously: Couldn't retrieve the JobExecutionResult from the JobManager.

2018-10-21 Thread vino yang
Hi Marke, Are you expecting your job to quickly return the results of the stream calculation? If it is running for a long time, you can run it in detached mode when you submit the job[1]. It will not cause your client to be blocked and stay connected to the Flink JobManager. Thanks, vino. [1]: h

Re: Cancel with savepoint on yarn-cluster mode can't retrieve savepointDir.

2018-10-21 Thread vino yang
Hi Shu Li, Paul is right, the issue has been fixed. Please upgrade it to the fixed specific version. Thanks, vino. Paul Lam 于2018年10月19日周五 下午2:35写道: > Hi, > > Please see https://issues.apache.org/jira/browse/FLINK-10309. > > Best, > Paul Lam > > 在 2018年10月19日,14:32,郑舒力 写道: > > Hi community. >

Re: ProgramDescription interface

2018-10-21 Thread vino yang
Hi Flavio, Ping Chesnay and Gary for you. Thanks, vino. Flavio Pompermaier 于2018年10月19日周五 下午11:01写道: > Hi to all, > is there any better way to get the list of required parameters by a Flink > job other than implementing ProgramDescription interface and implementing a > long (and somehow struct

Re: Initializing mapstate hangs

2018-10-20 Thread vino yang
Hi Ahmad, Can you try to dump thread info from the Task Manager's JVM instance? Thanks, vino. Ahmad Hassan 于2018年10月20日周六 下午4:24写道: > Flink 1.6.0. Valuestate initialises successful but mapstate hangs > > Regards > > On 20 Oct 2018, at 02:55, vino yang wrote: > > H

Re: Initializing mapstate hangs

2018-10-19 Thread vino yang
Hi Ahmad, Which version of Flink do you use? Thanks, vino. Ahmad Hassan 于2018年10月19日周五 下午11:32写道: > Hi, > > Initializing mapstate hangs in window function. However if i use > valuestate then it is initialized succcessfully. I am using rocksdb to > store the state. > > public class MyWindowFunc

Re: Pyflink

2018-10-19 Thread vino yang
Hi Bing, Ping Chesnay for you. Thanks, vino. Bing Lin 于2018年10月19日周五 上午5:46写道: > Hi, is there anyone working with pyflink streaming that can help me with > import errors, for example when include the paths for my libraries it gives > me a importerror type_check. Has anyone encountered anything

Re: Where are the TaskManagers IPs and Ports stored?

2018-10-15 Thread vino yang
Hi Chris, Please refer to official documentation [1][2]. The tm's RPC port is even dynamic, and Flink does not persist it. You can view them from the "Task Managers" menu on the left side of the Flink web UI. Thanks, vino. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/ops/con

Re: Can't start cluster

2018-10-15 Thread vino yang
Hi Mar_zieh, For questions about Python, you can ask Chesnay, I try to answer. Pyflink.bat is a script for running Flink programs for flink-python*.jar. You can see the implementation of it. Flink supports Jython. Also, if Python is your main programming language, you can try using the Apache Bea

Re: Flink 1.4: Queryable State Client

2018-10-13 Thread vino yang
Hi Seye, It seems that you have conducted an in-depth analysis of this issue. If you think it's a bug or need improvement. Please feel free to create a JIRA issue to track its status. Thanks, vino. Seye Jin 于2018年10月14日周日 上午12:02写道: > I recently upgraded to flink 1.4 from 1.3 and leverage Quer

Re: Are savepoints / checkpoints co-ordinated?

2018-10-13 Thread vino yang
Hi Anand, About "Cancel with savepoint" congxian is right. And for the duplicates, You should use kafka producer transaction (since 0.11) provided EXACTLY_ONCE semantic[1]. Thanks, vino. [1]: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/connectors/kafka.html#kafka-011 Congx

Re: How do I initialize the window state on first run?

2018-10-12 Thread vino yang
Hi Jiayi, If you don't mind, I would like to ask you what kind of situation do you have in this situation? Thanks, vino. bupt_ljy 于2018年10月12日周五 下午1:59写道: > Hi, > >I’m going to run a new Flink program with some initialized window > states. > >I can’t see there is an official way to do

Re: No data issued by flink window after a few hours

2018-10-11 Thread vino yang
t; > > Yours, > > September > > > -- > *发件人:* 潘 功森 > *发送时间:* Friday, October 12, 2018 11:05:49 AM > *收件人:* vino yang > *抄送:* user > *主题:* 答复: No data issued by flink window after a few hours > > > Hi, > > I found the pictures ma

Re: No data issued by flink window after a few hours

2018-10-10 Thread vino yang
tion? > > > > Yours, > > September > > > -- > *发件人:* 潘 功森 > *发送时间:* Wednesday, October 10, 2018 2:44:48 PM > *收件人:* vino yang > *抄送:* user > *主题:* 答复: No data issued by flink window after a few hours > > > Hi, > > > > Cause

Re: Getting NoMethod found error while running job on flink 1.6.1

2018-10-10 Thread vino yang
Hi Chandu, What mode does your Flink run in? In addition, can you check if the flink-metrics-core is included in the classpath of the Flink runtime environment? Thanks, vino. Chandu Kempaiah 于2018年10月11日周四 上午9:51写道: > > Hello, > > I am have a job that reads messages from kafka, processes them

Re: [DISCUSS] Integrate Flink SQL well with Hive ecosystem

2018-10-10 Thread vino yang
Hi Xuefu, Appreciate this proposal, and like Fabian, it would look better if you can give more details of the plan. Thanks, vino. Fabian Hueske 于2018年10月10日周三 下午5:27写道: > Hi Xuefu, > > Welcome to the Flink community and thanks for starting this discussion! > Better Hive integration would be re

Re: No data issued by flink window after a few hours

2018-10-09 Thread vino yang
you need to reserve enough memory for Flink. Thanks, vino. 潘 功森 于2018年10月10日周三 上午11:36写道: > Please have a look about my last mail. > > > > When the cached window data is too large, how? > > > > Yours, > > September > > > ----

Re: No data issued by flink window after a few hours

2018-10-09 Thread vino yang
t the distinct users, I need to cache the user id > about one hour. > > Cause there’re no related errors. > > Yours, > > September > > > -- > *发件人:* vino yang > *发送时间:* Wednesday, October 10, 2018 10:49:43 AM > *抄送:* user > *主题

Re: flink:latest container on kubernetes fails to connect taskmanager to jobmanager

2018-10-09 Thread vino yang
Hi jwatte, Maybe Till can help you. Thanks, vino. jwatte 于2018年10月2日周二 上午5:30写道: > It turns out that the latest flink:latest docker image is 5 days old, and > thus bug was fixed 4 days ago in the flink-docker github. > > The problem is that the docker-entrypoint.sh script chains to jobmanager.

Re: No data issued by flink window after a few hours

2018-10-09 Thread vino yang
Hi, Can you explain what "ram to cache the distinct data about sliding window" mean? The information you provide is too small and will not help others to help you analyze the problem and provide advice. In addition, regarding the usage of Flink related issues, please only send mail to the user ma

Re: Measure task execution time

2018-09-29 Thread vino yang
Hi, Can you express it more clearly? Whether you measure the execution time of the job or the execution time of the task instance. Why can't you measure the kind of scene you said? All jobs are logically a DAG. Thanks, vino. Alejandro 于2018年9月26日周三 下午4:17写道: > Hello, > > I am trying to measure

Re: Job failing during restore in different cluster

2018-09-29 Thread vino yang
Hi, Which version of Flink do you use? Also, can you give more details about how you migrate your job? Thanks, vino. shashank734 于2018年9月28日周五 下午10:21写道: > Hi, I am trying to move my job from one cluster to another cluster using > Savepoint. But It's failing while restoring on the new cluster.

Re: flink-1.6.1-bin-scala_2.11.tgz fails signture and hash verification

2018-09-29 Thread vino yang
Hi Gianluca, This is very strange, Till may be able to give an explanation, because it is the release manager of this version. Thanks, vino. Gianluca Ortelli 于2018年9月28日周五 下午4:02写道: > Hi, > > I just downloaded flink-1.6.1-bin-scala_2.11.tgz from > https://flink.apache.org/downloads.html and no

Re: [DISCUSS] Dropping flink-storm?

2018-09-28 Thread vino yang
Hi, +1, I agree. In addition, some users ask questions about the integration of Storm compatibility mode with the newer Flink version on the mailing list. It seems that they are not aware that some of Flink's new features are no longer available in Storm compatibility mode. This can be confusing

Re: Does Flink SQL "in" operation has length limit?

2018-09-28 Thread vino yang
Hi Henry, Maybe the number of elements in your IN clause is out of range? Its default value is 20, you can modify it with this configuration item: *withInSubQueryThreshold(XXX)* This API comes from Calcite. Thanks, vino. 徐涛 于2018年9月28日周五 下午4:23写道: > Hi, > > When I am executing the follow

Re: Regarding implementation of aggregate function using a ProcessFunction

2018-09-28 Thread vino yang
onReturnType( > function, input.getType(), null, false); > > return aggregate(function, accumulatorType, resultType); > } > > > Kindly, check above snapshot of flink;s aggregate() method, that got > applied on windowed stream. > > Thanks & Regards > Gaurav Luthra > Mo

Re: Regarding implementation of aggregate function using a ProcessFunction

2018-09-28 Thread vino yang
n not use state. > > Thanks & Regards > Gaurav > > > > On Fri, 28 Sep, 2018, 12:40 PM vino yang, wrote: > >> Hi Gaurav, >> >> Why do you think the RichAggregateFunction cannot access the State API? >> RichAggregateFunction inherits from AbstractRichFu

Re: Regarding implementation of aggregate function using a ProcessFunction

2018-09-28 Thread vino yang
Hi Gaurav, Why do you think the RichAggregateFunction cannot access the State API? RichAggregateFunction inherits from AbstractRichFunction (it provides a RuntimeContext that allows you to access the state API). Thanks, vino. Gaurav Luthra 于2018年9月28日周五 下午1:38写道: > Hi, > > As we are aware, Cur

Re: JobManager in HA with a single node loses leadership

2018-09-27 Thread vino yang
Hi Julio, Which version of Flink are you using? If it is 1.5+, then you can try to increase the heartbeat timeout by configuring it[1]. In addition, the possible cause is that the load of tm is too heavy, for example, because the Full GC causes JVM stalls, or deadlocks and other issues may cause h

Re: How to join stream and batch data in Flink?

2018-09-25 Thread vino yang
gt; mentioned. For join, it doesn't guarantee output orders. You have to do > orderBy if you want to get ordered results. > > Best, Hequn > > On Tue, Sep 25, 2018 at 8:36 PM vino yang wrote: > >> Hi Fabian, >> >> I may not have stated it here, and there is

Re: How to join stream and batch data in Flink?

2018-09-25 Thread vino yang
in my opinion the > implementation is also mature. > However, you should not use the non-windowed join if any of the input > tables is ever growing because both sides must be hold in state. This is > not an issue of the semantics. > > Cheers, > Fabian > > Am Di., 25. Sep. 20

Re: How to join stream and batch data in Flink?

2018-09-25 Thread vino yang
match the results, because the data belonging > to the mysql table is just beginning to play as a stream*” Why it is not > able to match the results? > > Best > Henry > > 在 2018年9月25日,下午5:29,vino yang 写道: > > Hi Henry, > > If you have converted the mysql table to a

Re: How to join stream and batch data in Flink?

2018-09-25 Thread vino yang
nd of join will be supported in FLINK-9712 > <https://issues.apache.org/jira/browse/FLINK-9712>. You can check more > details in the jira. > > Best, Hequn > > On Fri, Sep 21, 2018 at 4:51 PM vino yang wrote: > >> Hi Henry, >> >> There are three ways I can think of: >

Re: When should the RETAIN_ON_CANCELLATION option be used?

2018-09-25 Thread vino yang
> fails,because the checkpoint will not be deleted, I still can have a > checkpoint that can be used to resume. > Please help to correct me if I am wrong. > > Thanks. > > Best > Henry > > 在 2018年9月25日,下午2:22,vino yang 写道: > > Hi Henry, > > I gave a blue comment in yo

Re: When should the RETAIN_ON_CANCELLATION option be used?

2018-09-24 Thread vino yang
use. Your main concern is ExternalizedCheckpointCleanup, which cleans up the metadata for externalized checkpoints. Are you sure you want to use it? Flink defaults to self-management checkpoint cleanup, which is a non-externalized checkpoint. > Best > Henry > > > 在 2018年9月25日,上午11:16

Re: When should the RETAIN_ON_CANCELLATION option be used?

2018-09-24 Thread vino yang
Hi Henry, Answer your question: What is the definition and difference between job cancel and job fails? > The cancellation and failure of the job will cause the job to enter the termination state. But cancellation is artificially triggered and normally terminated, while failure is usually a pass

Re: 1.5 Checkpoint metadata location

2018-09-24 Thread vino yang
Hi Bryant, Maybe Stefan can answer your question, ping him for you. Thanks, vino. Bryant Baltes 于2018年9月25日周二 上午12:29写道: > Hi All, > > After upgrading from 1.3.2 to 1.5.2, one of our apps that uses > checkpointing no longer writes metadata files to the state.checkpoints.dir > location provided

Re: Flink not running properly.

2018-09-23 Thread vino yang
Hi, According to the instructions in the script: # Long.MAX_VALUE in TB: This is an upper bound, much less direct memory will be used TM_MAX_OFFHEAP_SIZE="8388607T" I think you may need to confirm if your operating system and the JDK you installed on the TM are 64-bit. Thanks, vino. Sarabjyo

Re: Between Checkpoints in Kafka 11

2018-09-23 Thread vino yang
Hi Harshvardhan, In fact, Flink does not cache data between two checkpoints. In fact, Flink only calls different operations at different points in time. These operations are provided by the Kafka client, so you should have a deeper understanding of the principles of Kafka producer transactions. I

Re: How to join stream and batch data in Flink?

2018-09-21 Thread vino yang
Hi Henry, There are three ways I can think of: 1) use DataStream API, implement a flatmap UDF to access dimension table; 2) use table/sql API, implement a UDTF to access dimension table; 3) customize the table/sql join API/statement's implementation (and change the physical plan) Thanks, vino.

Re: Flink can't initialize operator state backend when starting from checkpoint

2018-09-21 Thread vino yang
Hi Qingxiang, Several days ago, Stefan described the causes of this anomaly in a problem similar to this: Typically, these problems have been observed when something was wrong with a serializer or a stateful serializer was used from multiple threads. Thanks, vino. Marvin777 于2018年9月21日周五 下午3:20

Re: Strange behaviour with checkpointing and custom FilePathFilter

2018-09-20 Thread vino yang
Hi Averell, Is this all the custom code for "CustomFileSource"? If not, can you share the entire file with us, and if you can set the log level to DEBUG, it will help you analyze and locate the problem. If you can't come to a conclusion, you can share the log with us. Thanks, vino. Averell 于20

Re: HA failing for 1.6.0 job cluster with docker-compose

2018-09-20 Thread vino yang
Till > > On Thu, Sep 20, 2018 at 10:12 AM vino yang wrote: > >> Hi Tzanko, >> >> Maybe Till is more appropriate to answer this question. >> >> Thanks, vino. >> >> Tzanko Matev 于2018年9月19日周三 下午5:47写道: >> >>> Dear all, >>> >

Re: job submitting hanging

2018-09-20 Thread vino yang
Hi Jason, Chesnay and Gary are familiar with the rest API, maybe they can help you. If you can share the client's commit log and set the client and JM log level to DEBUG it would be better. Thanks, vino. Jason Chang 于2018年9月19日周三 上午8:31写道: > Hi all, > > Our team is setting up Flink on k8s. Aft

Re: Checkpointing not working

2018-09-20 Thread vino yang
Hi Yubraj, Can you set your log print level to DEBUG and share it with us or share a screenshot of your Flink web UI checkpoint information? Thanks, vino. Jörn Franke 于2018年9月19日周三 下午2:37写道: > What do the logfiles say? > > How does the source code looks like? > > Is it really needed to do chec

Re: HA failing for 1.6.0 job cluster with docker-compose

2018-09-20 Thread vino yang
Hi Tzanko, Maybe Till is more appropriate to answer this question. Thanks, vino. Tzanko Matev 于2018年9月19日周三 下午5:47写道: > Dear all, > > I am currently experimenting with a Flink 1.6.0 job cluster. The goal is > to run a streaming job on K8s. Right now I am using docker-compose to > experiment wi

Re: Errors in QueryableState sample code?

2018-09-19 Thread vino yang
Hi Ken, About the first question, the way of fixing is right. About the second question, you are right, the "response.get()" missed 'ValueState<>'. Can you create a JIRA issue or fix it if you want. Thanks, vino. Ken Krugler 于2018年9月20日周四 上午6:27写道: > Hi all, > > I was looking at the Example

Re: Which mode to choose flink on Yarn.

2018-09-19 Thread vino yang
Hi weilong, As you said, there are advantages and disadvantages to each of the two approaches. However, I hope you know that the "single job" mode has a huge advantage over the "YARN flink session" mode in that it provides job-level isolation (whether JM or TM). This will allow the Job to be more

Re: Add operator ids for an already running job

2018-09-18 Thread vino yang
Hi Paul, Referring to the Flink official documentation, it seems that this is not possible at this time.[1] Someone has previously discussed the state of the savepoint in the dev mailing list. Perhaps this method can meet your needs after implementation. [2] Thanks, vino. [1]: https://ci.apache

Re: In which case the StreamNode has multiple output edges?

2018-09-17 Thread vino yang
Hi tao, The Dataflow abstraction of Flink runtime is a DAG. In a graph, there may be more than one in-edge and one out-edge. A simple example of multiple out margins is that an operator is followed by multiple sinks. For example, a sink to kafka and a sink to elasticsearch. Thanks, vino. 徐涛 于20

Re: LocalEnvironment and Python streaming

2018-09-16 Thread vino yang
Hi Joe, Maybe Chesnay is better suited to answer this question, Ping him for you. Thanks, vino. Joe Malt 于2018年9月15日周六 上午1:51写道: > Hi, > > Is there any way to execute a job using the LocalEnvironment when using > the Python streaming API? This would make it much easier to debug jobs. > > At th

<    1   2   3   4   5   >