Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Yang Wang
Hi Pankaj, Always using `-yid` to submit a Flink job to an existing YARN session cluster is the safe way. For example, `flink run -d -yid application_1234 examples/streaming/WordCount.jar`. Maybe the magic properties file will be removed in the future. Best, Yang Pankaj Chand wrote on Friday, December 13, 2019

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Pankaj Chand
Vino and Kostas: Thank you for the info! I was using Flink 1.9.1 with Pre-bundled Hadoop 2.7.5. Cloudlab has quarantined my cluster experiment without notice, so I'll let you know if and when they allow me to access the files in the future. Regards, Pankaj On Thu, Dec 12, 2019 at 8:35 AM

How to understand create watermark for Kafka partitions

2019-12-12 Thread qq
Hi all, I am confused about watermarks for each Kafka partition. As I understand it, watermarks are created at the data stream level, so why do people also say that watermarks are created for each Kafka topic partition? In my tests, watermarks were also created globally, even when I ran my job in parallel. And assigning watermarks on
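For illustration, here is a minimal Java sketch of what per-partition watermarking means in practice, assuming Flink 1.9's universal Kafka connector, a hypothetical topic "events" whose records start with an epoch-millis timestamp field, and a local broker. The key point is that the assigner is set on the consumer itself rather than on the resulting DataStream, so Flink runs one watermark generator per Kafka partition inside the source and forwards the minimum of the per-partition watermarks:

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.BoundedOutOfOrdernessTimestampExtractor;
import org.apache.flink.streaming.api.windowing.time.Time;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;

public class PerPartitionWatermarks {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092"); // hypothetical broker
        props.setProperty("group.id", "watermark-demo");

        FlinkKafkaConsumer<String> consumer =
                new FlinkKafkaConsumer<>("events", new SimpleStringSchema(), props);

        // Assigned on the consumer (not on the DataStream), so the extractor runs
        // per Kafka partition inside the source; the watermark emitted downstream is
        // the minimum across the partitions each source subtask reads.
        consumer.assignTimestampsAndWatermarks(
                new BoundedOutOfOrdernessTimestampExtractor<String>(Time.seconds(5)) {
                    @Override
                    public long extractTimestamp(String element) {
                        // hypothetical record layout: "<epochMillis>,<payload>"
                        return Long.parseLong(element.split(",")[0]);
                    }
                });

        env.addSource(consumer).print();
        env.execute("per-partition watermarks sketch");
    }
}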

Re: State Processor API: StateMigrationException for keyed state

2019-12-12 Thread vino yang
Hi pwestermann, Can you share the detailed exception message? Best, Vino pwestermann wrote on Friday, December 13, 2019 at 2:00 AM: > I am trying out the new State Processor API but I am having trouble with > keyed state (this is for Flink 1.9.1 with RocksDB on S3 as the backend). > I can read keyed

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
Ok, I think I identified the issue: 1. I accidentally bundled another version of slf4j in my job jar, which resulted in an incompatibility with the slf4j jar bundled with flink/bin. Apparently slf4j in this case defaults to something that ignores the conf? Once I removed slf4j from my job jar,

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread Li Peng
Hey ouywl, interesting, I figured something like that would happen. I actually replaced all the log4j-x files with the same config I originally posted, including log4j-console, but that didn't change the behavior either. Hey Yang, yes I verified the properties files are as I configured, and that

State Processor API: StateMigrationException for keyed state

2019-12-12 Thread pwestermann
I am trying out the new State Processor API but I am having trouble with keyed state (this is for Flink 1.9.1 with RocksDB on S3 as the backend). I can read keyed state for simple key types such as Strings, but whenever I try to read state with a more complex key type, such as a named tuple
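For context on the API being discussed, here is a minimal Java sketch of reading keyed state with the 1.9 State Processor API. The operator uid "my-operator", the Integer ValueState named "count", the String key type, and the S3 paths are all hypothetical placeholders rather than the poster's actual job; the problem described in the thread appears only once the key is a more complex type than this:

import org.apache.flink.api.common.state.ValueState;
import org.apache.flink.api.common.state.ValueStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.api.java.DataSet;
import org.apache.flink.api.java.ExecutionEnvironment;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.state.api.ExistingSavepoint;
import org.apache.flink.state.api.Savepoint;
import org.apache.flink.state.api.functions.KeyedStateReaderFunction;
import org.apache.flink.util.Collector;

public class ReadKeyedStateSketch {

    // Emits one (key, count) pair for every key found in the savepoint.
    static class CountReader extends KeyedStateReaderFunction<String, Tuple2<String, Integer>> {
        private transient ValueState<Integer> count;

        @Override
        public void open(Configuration parameters) {
            count = getRuntimeContext().getState(
                    new ValueStateDescriptor<>("count", Types.INT));
        }

        @Override
        public void readKey(String key, Context ctx, Collector<Tuple2<String, Integer>> out) throws Exception {
            out.collect(Tuple2.of(key, count.value()));
        }
    }

    public static void main(String[] args) throws Exception {
        ExecutionEnvironment env = ExecutionEnvironment.getExecutionEnvironment();

        // Hypothetical savepoint path; the state backend must match the one the job used.
        ExistingSavepoint savepoint = Savepoint.load(
                env,
                "s3://my-bucket/savepoints/savepoint-123",
                new RocksDBStateBackend("s3://my-bucket/checkpoints"));

        DataSet<Tuple2<String, Integer>> counts =
                savepoint.readKeyedState("my-operator", new CountReader());
        counts.print();
    }
}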

Join a datastream with tables stored in Hive

2019-12-12 Thread Krzysztof Zarzycki
Hello dear Flinkers, If this kind of question has been asked on the list before, I'm sorry for the duplicate; feel free to just point me to the thread. I have to solve a probably pretty common case of joining a datastream to a dataset. Let's say I have the following setup: * I have a high-pace stream of

Re: Flink ML feature

2019-12-12 Thread Rong Rong
Hi guys, Yes, as Till mentioned, the community is working on a new ML library and we are working closely with the Alink project to bring in the algorithms. You can find more information regarding the new ML design architecture in FLIP-39 [1]. One of the major changes is that the new ML library [2]

Flink 1.9.1 SQL incompatibility with earlier versions

2019-12-12 Thread 李佟
We recently upgraded Flink, migrating our programs from the old cluster (where 1.8.0 ran fine) to a new cluster (1.9.1). When deploying, we found that a Flink SQL program that previously ran fine can no longer execute on the 1.9.1 cluster; the exception is as follows: org.apache.flink.table.api.ValidationException: Window can only be defined over a time attribute column. at

Flink 1.9.1 SQL compatibility issue with 1.8.x

2019-12-12 Thread 李佟
Hi All: We recently upgraded Flink, migrating our programs from the old cluster (where 1.8.0 and 1.8.3 ran fine) to a new cluster (1.9.1). When deploying, we found that a Flink SQL program that previously ran fine can no longer execute on the 1.9.1 cluster; the exception is as follows: org.apache.flink.table.api.ValidationException: Window can only be defined over a time attribute column. at
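For reference, this exception is raised when the column used in the group window is not a declared time attribute. Below is a minimal Java sketch (a hypothetical (userId, amount) stream with placeholder timestamps, not the failing program itself) showing a rowtime attribute being declared when the stream is registered, so that a windowed GROUP BY passes validation on 1.9:

import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.TimeCharacteristic;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.functions.timestamps.AscendingTimestampExtractor;
import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.java.StreamTableEnvironment;
import org.apache.flink.types.Row;

public class RowtimeWindowSketch {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();
        env.setStreamTimeCharacteristic(TimeCharacteristic.EventTime);
        StreamTableEnvironment tEnv = StreamTableEnvironment.create(env);

        // Hypothetical (userId, amount) stream with placeholder event timestamps.
        DataStream<Tuple2<String, Double>> payments = env
                .fromElements(Tuple2.of("alice", 1.0), Tuple2.of("bob", 2.0))
                .assignTimestampsAndWatermarks(new AscendingTimestampExtractor<Tuple2<String, Double>>() {
                    @Override
                    public long extractAscendingTimestamp(Tuple2<String, Double> element) {
                        return System.currentTimeMillis(); // placeholder timestamps
                    }
                });

        // "ts.rowtime" appends an event-time attribute column; without a time
        // attribute, grouping by TUMBLE(...) fails with the ValidationException above.
        tEnv.registerDataStream("payments", payments, "userId, amount, ts.rowtime");

        Table result = tEnv.sqlQuery(
                "SELECT userId, SUM(amount) " +
                "FROM payments " +
                "GROUP BY userId, TUMBLE(ts, INTERVAL '1' MINUTE)");

        tEnv.toRetractStream(result, Row.class).print();
        env.execute("rowtime window sketch");
    }
}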

Re: Flink 1.9.1 KafkaConnector missing data (1M+ records)

2019-12-12 Thread Kostas Kloudas
Hi Harrison, Really sorry for the late reply. Do you have any insight into whether the missing records were read by the consumer and just the StreamingFileSink failed to write their offsets, or whether the Kafka consumer did not even read them or dropped them for some reason? I am asking this in order to

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Kostas Kloudas
Hi Pankaj, When you start a session cluster with the bin/yarn-session.sh script, Flink will create the cluster and then write a "Yarn Properties file" named ".yarn-properties-YOUR_USER_NAME" in the directory: either the one specified by the option "yarn.properties-file.location" in the

Re: countWindow

2019-12-12 Thread cs
… countWindow … global window … Kafka … Kafka … window … ---- Original message ---- From: "Jimmy Wong"

Re: Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread vino yang
Hi Pankaj, Can you tell us which Flink version you are using? And can you share the Flink client and job manager logs with us? This information would help us locate your problem. Best, Vino Pankaj Chand wrote on Thursday, December 12, 2019 at 7:08 PM: > Hello, > > When using Flink on YARN in session mode, each

Re: flink savepoint checkpoint

2019-12-12 Thread Congxian Qiu
Flink also supports resuming from a retained checkpoint; see the documentation [1]. [1] https://ci.apache.org/projects/flink/flink-docs-release-1.9/ops/state/checkpoints.html#resuming-from-a-retained-checkpoint Best, Congxian 陈帅 wrote on Wednesday, December 11, 2019 at 9:34 PM: > Flink 1.9 supports the cancel-job-with-savepoint feature > >
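For completeness, here is a minimal Java sketch of the job-side setting that makes checkpoints retained (externalized) in the first place, so that they can later be resumed from like a savepoint; the interval and the pipeline below are placeholders:

import org.apache.flink.streaming.api.environment.CheckpointConfig.ExternalizedCheckpointCleanup;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RetainedCheckpointConfig {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Checkpoint every 60s and keep the latest completed checkpoint around even
        // when the job is cancelled, so it can be resumed via
        // `flink run -s <checkpointMetaDataPath> ...`.
        env.enableCheckpointing(60_000);
        env.getCheckpointConfig().enableExternalizedCheckpoints(
                ExternalizedCheckpointCleanup.RETAIN_ON_CANCELLATION);

        env.fromElements(1, 2, 3).print(); // placeholder pipeline
        env.execute("retained checkpoint sketch");
    }
}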

Re: countWindow

2019-12-12 Thread Jimmy Wong
… window … window … | Jimmy Wong | wangzmk...@163.com | Signature customized by NetEase Mail Master On December 12, 2019, 19:07, cs <58683...@qq.com> wrote:

Re: Flink on Kubernetes seems to ignore log4j.properties

2019-12-12 Thread ouywl
@Li Peng, I found your problem. Your start command uses the "start-foreground" argument, so it runs `exec "${FLINK_BIN_DIR}"/flink-console.sh ${ENTRY_POINT_NAME} "${ARGS[@]}"`, and in 'flink-console.sh' the code is

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread Timo Walther
Hi, the serializers are created from TypeInformation, so you can simply inspect the type information, e.g. by using this in the Scala API: val typeInfo = createTypeInformation[MyClassToAnalyze] and going through the object using a debugger. Actually, I don't understand why scala.Tuple2 is

countWindow

2019-12-12 Thread cs
… countWindow

Flink client trying to submit jobs to old session cluster (which was killed)

2019-12-12 Thread Pankaj Chand
Hello, When using Flink on YARN in session mode, each Flink job client would automatically know the YARN cluster to connect to. It says this somewhere in the documentation. So, I killed the Flink session cluster by simply killing the YARN application using the "yarn kill" command. However, when

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Pankaj Chand
Thank you, Chesnay! On Thu, Dec 12, 2019 at 5:46 AM Chesnay Schepler wrote: > Yes, when a cluster has just been started it takes a few seconds for (any) metrics > to become available. > > On 12/12/2019 11:36, Pankaj Chand wrote: > > Hi Vino, > > Thank you for the links regarding backpressure! > > I am

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Chesnay Schepler
Yes, when a cluster has just been started it takes a few seconds for (any) metrics to become available. On 12/12/2019 11:36, Pankaj Chand wrote: Hi Vino, Thank you for the links regarding backpressure! I am currently using code to get metrics by calling the REST API via curl. However, many times the REST API
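Below is a small Java sketch of one way to cope with that start-up window: poll the JobManager metrics endpoint until it returns something non-empty instead of failing on the first empty reply. The host, port, and chosen metric are assumptions for a default local setup:

import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.charset.StandardCharsets;

public class MetricsPoller {
    public static void main(String[] args) throws Exception {
        // Hypothetical default local setup; adjust host, port, and metric as needed.
        URL url = new URL("http://localhost:8081/jobmanager/metrics?get=Status.JVM.CPU.Load");

        // Metrics are registered a few seconds after the cluster starts, so early
        // calls legitimately return "[]"; retry instead of failing immediately.
        for (int attempt = 0; attempt < 10; attempt++) {
            try {
                HttpURLConnection conn = (HttpURLConnection) url.openConnection();
                try (BufferedReader reader = new BufferedReader(
                        new InputStreamReader(conn.getInputStream(), StandardCharsets.UTF_8))) {
                    String body = reader.readLine();
                    if (body != null && !"[]".equals(body)) {
                        System.out.println(body);
                        return;
                    }
                } finally {
                    conn.disconnect();
                }
            } catch (IOException e) {
                // REST endpoint not reachable yet; keep retrying.
            }
            Thread.sleep(1000);
        }
        System.err.println("Metrics still empty after retries");
    }
}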

Re: Sample Code for querying Flink's default metrics

2019-12-12 Thread Pankaj Chand
Hi Vino, Thank you for the links regarding backpressure! I am currently using code to get metrics by calling the REST API via curl. However, many times the REST API via curl gives an empty JSON object/array. When piped through jq (for filtering JSON), it produces a null value. This is breaking my code.

Re: Scala case class TypeInformation and Serializer

2019-12-12 Thread 杨光
Actually, the original source code has too many third-party classes and is hard to simplify. The question I want to ask is: is there any way for me to find out which class is serialized/deserialized by which serializer class? Then we can tune it or write a custom serializer to improve performance. Yun Tang

Re: Window deduplication

2019-12-12 Thread Jimmy Wong
Thanks everyone, I have come up with a solution. Scenario 1: trigger a computation every time a record arrives, and then clear the state once the window computation finishes. Scenario 2: we really do have to wait for the window computation to finish. | Jimmy Wong | wangzmk...@163.com | Signature customized by NetEase Mail Master On December 11, 2019, 16:26, yanggang_it_job wrote: I think it can be handled like this: 1. first register your stream as a table (no matter whether it is one stream or several); 2. then express the business logic on that table with Flink SQL; 3. finally use Flink

Re: [ANNOUNCE] Apache Flink 1.8.3 released

2019-12-12 Thread Zhu Zhu
Thanks Hequn for driving the release and everyone who makes this release possible! Thanks, Zhu Zhu Wei Zhong 于2019年12月12日周四 下午3:45写道: > Thanks Hequn for being the release manager. Great work! > > Best, > Wei > > 在 2019年12月12日,15:27,Jingsong Li 写道: > > Thanks Hequn for your driving, 1.8.3