Re: Need help using AggregateFunction instead of FoldFunction

2019-12-08 Thread vino yang
Hi dev, The time of a window can have different semantics. For a session window it is only a time gap: the size of the window is driven by the activity of the events. For a tumbling or sliding window it means the fixed size of the window. For more details, please see the official documentation.[1] Best,
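A minimal sketch of the difference in Scala (field names, gap and size values are made up for illustration; event-time windows also need timestamps/watermarks assigned upstream):

    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.{EventTimeSessionWindows, TumblingEventTimeWindows}
    import org.apache.flink.streaming.api.windowing.time.Time

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events: DataStream[(String, Long)] = env.fromElements(("a", 1L), ("a", 2L), ("b", 3L))

    // Session window: 10 minutes is a *gap*; the window keeps growing as long
    // as events arrive within that gap, so its size is driven by the data.
    val sessions = events
      .keyBy(_._1)
      .window(EventTimeSessionWindows.withGap(Time.minutes(10)))
      .sum(1)

    // Tumbling window: 10 minutes is the fixed *size* of every window.
    val tumbling = events
      .keyBy(_._1)
      .window(TumblingEventTimeWindows.of(Time.minutes(10)))
      .sum(1)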

Re: [flink-sql] When creating a table via tableEnv.sqlUpdate(ddl), how do you specify the rowtime?

2019-12-08 Thread JingsongLee
Hi 猫猫: Defining a rowtime in DDL is a feature that has only just been supported; the documentation is still being written.[1] You can try it out by building from master; the community is preparing the 1.10 release, and a release version will be available then. There is a complete usage example in [2], FYI. [1] https://issues.apache.org/jira/browse/FLINK-14320 [2]
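A hedged sketch of what such a DDL can look like on master (table/column names and connector properties are assumptions; check FLINK-14320 and the in-progress docs for the final syntax):

    import org.apache.flink.streaming.api.scala.StreamExecutionEnvironment
    import org.apache.flink.table.api.EnvironmentSettings
    import org.apache.flink.table.api.scala.StreamTableEnvironment

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val settings = EnvironmentSettings.newInstance().useBlinkPlanner().inStreamingMode().build()
    val tableEnv = StreamTableEnvironment.create(env, settings)

    // Hypothetical table: `ts` is declared as the rowtime attribute via a WATERMARK clause.
    val ddl =
      """
        |CREATE TABLE user_actions (
        |  user_id STRING,
        |  action  STRING,
        |  ts      TIMESTAMP(3),
        |  WATERMARK FOR ts AS ts - INTERVAL '5' SECOND
        |) WITH (
        |  'connector.type' = 'kafka'
        |  -- remaining connector properties omitted
        |)
      """.stripMargin

    tableEnv.sqlUpdate(ddl)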

Re: What is the recommended way to write a Flink real-time data warehouse to Hive?

2019-12-08 Thread JingsongLee
Hi 帅, - At the moment Parquet can be supported by adapting StreamingFileSink. (Supporting ORC with StreamingFileSink is currently rather hard.) - BulkWriter has nothing to do with batch processing; it is just a concept within StreamingFileSink. - Syncing Hive partitions needs custom work for now; StreamingFileSink has nothing ready-made for it. In 1.11 the Table layer will keep working on this area: for real-time data warehouses landing in Hive, problems such as data skew and partition visibility will be addressed one by one.[1] [1] https://issues.apache.org/jira/browse/FLINK-14249
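A minimal sketch of the StreamingFileSink + Parquet route mentioned above, assuming the flink-parquet dependency and an Avro-reflectable record type (the record and output path are made up):

    import org.apache.flink.core.fs.Path
    import org.apache.flink.formats.parquet.avro.ParquetAvroWriters
    import org.apache.flink.streaming.api.functions.sink.filesystem.StreamingFileSink
    import org.apache.flink.streaming.api.scala._

    // Hypothetical record type; the Parquet schema is derived via Avro reflection.
    case class Event(userId: String, action: String, ts: Long)

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val events: DataStream[Event] = env.fromElements(Event("u1", "click", 1L))

    // Bulk-encoded sink: the BulkWriter here is only the per-file encoder
    // abstraction, it does not imply batch execution.
    val sink: StreamingFileSink[Event] = StreamingFileSink
      .forBulkFormat(new Path("hdfs:///warehouse/events"), ParquetAvroWriters.forReflectRecord(classOf[Event]))
      .build()

    events.addSink(sink)
    env.execute("parquet-sink-sketch")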

Re: How do you convert a Flink RetractStream into an AppendStream?

2019-12-08 Thread JingsongLee
+1 to lucas.wu Best, Jingsong Lee -- From: lucas.wu Send Time: Monday, December 9, 2019 11:39 To: user-zh Subject: Re: How do you convert a Flink RetractStream into an AppendStream? You can use something like this: // val sstream = result4.toRetractStream[Row].filter(_._1 == true).map(_._2)

Re: How do you convert a Flink RetractStream into an AppendStream?

2019-12-08 Thread JingsongLee
Hi 帅, You can first convert the RetractStream into a DataStream, which gives you a stream of Tuples; then write a MapFunction to filter it, and finally write the DataStream into Kafka. Best, Jingsong Lee -- From: Jark Wu Send Time: Sunday, December 8, 2019 11:54 To: user-zh Subject: Re: How do you convert a Flink RetractStream into an AppendStream? Hi,
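A hedged sketch of that pipeline (the table, topic and serialization choices are assumptions):

    import java.util.Properties

    import org.apache.flink.api.common.serialization.SimpleStringSchema
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer
    import org.apache.flink.table.api.Table
    import org.apache.flink.table.api.scala._
    import org.apache.flink.types.Row

    // `tEnv` and `resultTable` come from the surrounding Table/SQL program.
    def writeAsAppendToKafka(tEnv: StreamTableEnvironment, resultTable: Table): Unit = {
      // toRetractStream yields (isAccumulate, Row); keep only the accumulate messages.
      val appendOnly: DataStream[Row] = tEnv
        .toRetractStream[Row](resultTable)
        .filter(_._1)
        .map(_._2)

      val props = new Properties()
      props.setProperty("bootstrap.servers", "localhost:9092")

      // Serialize each Row as a string and write it to a (hypothetical) Kafka topic.
      appendOnly
        .map(_.toString)
        .addSink(new FlinkKafkaProducer[String]("output-topic", new SimpleStringSchema(), props))
    }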

Re: [DISCUSS] Adding e2e tests for Flink's Mesos integration

2019-12-08 Thread Yang Wang
Thanks Yangze for starting this discussion. Just sharing my thoughts. If the official Mesos Docker image cannot meet our requirements, I suggest building the image locally. We have done the same thing for the YARN e2e tests. This way is more flexible and easier to maintain. However, I have no idea

Re: KeyBy/Rebalance overhead?

2019-12-08 Thread vino yang
Hi Komal, KeyBy (hash partitioning, a logical partitioning) and rebalance (a physical partitioning) are both among the partitioning strategies supported by Flink.[1] Generally speaking, partitioning may incur network communication (network shuffle) costs, which may add to the overall time. The example provided by you
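A small sketch of the two operations under discussion (source data and parallelism are made up):

    import org.apache.flink.streaming.api.scala._

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    env.setParallelism(4)

    val readings: DataStream[(String, Double)] =
      env.fromElements(("sensor-1", 10.0), ("sensor-2", 20.0), ("sensor-1", 30.0))

    // keyBy: logical hash partitioning; all records with the same key go to the
    // same parallel subtask, which requires a network shuffle between operators.
    val byKey = readings.keyBy(_._1).sum(1)

    // rebalance: physical round-robin redistribution to even out load, which
    // also pays the network-shuffle cost.
    val balanced = readings.rebalance.map(r => (r._1, r._2 * 2))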

Flink on k8s: how to specify the entry point of the user program

2019-12-08 Thread aven . wu
Hi all! Regarding Flink on k8s: after reading the official documentation and the Dockerfile, docker-entrypoint.sh, and job-cluster-job.yaml.template files, I have the following questions. 1. After the standalone jobmanager has started, how does it know the main entry point of the user program (which main method to execute)? If it is set when packaging with Maven, how can I avoid setting it at packaging time and instead pass it on the command line, similar to -c in YARN mode? 2. If the entry point of the user program is specified in standalone-job.sh, how do I pass user-defined arguments (received in the args[] of the user's main program)? Sent from Windows

Sample Code for querying Flink's default metrics

2019-12-08 Thread Pankaj Chand
Hello, Using Flink on YARN, I could not understand from the documentation how to read the default metrics programmatically. In particular, I want to read throughput-related metrics such as CPU usage, a Task/Operator's numRecordsOutPerSecond, and memory usage. Is there any sample code for how to read such default metrics? Is
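There does not seem to be much sample code for the system metrics; one way is to poll the monitoring REST API. A hedged sketch (host/port, the task manager id and the metric names are assumptions; list the available metrics first by calling the .../metrics endpoint without the get parameter):

    import scala.io.Source

    val base = "http://localhost:8081"

    // Which task managers exist? Their ids are needed for the metric queries below.
    println(Source.fromURL(s"$base/taskmanagers").mkString)

    // CPU and heap metrics of one task manager (placeholder id).
    val tmId = "container_000001"
    println(Source.fromURL(
      s"$base/taskmanagers/$tmId/metrics?get=Status.JVM.CPU.Load,Status.JVM.Memory.Heap.Used").mkString)

Per-operator rates such as numRecordsOutPerSecond can be queried the same way under /jobs/<jobId>/vertices/<vertexId>/metrics.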

Re: How do you convert a Flink RetractStream into an AppendStream?

2019-12-08 Thread lucas.wu
You can use something like this: // val sstream = result4.toRetractStream[Row].filter(_._1 == true).map(_._2) // val result5 = tEnv.fromDataStream(sstream) // result5.toAppendStream[Row].print() Original email From: Jark Wu imj...@gmail.com To: user-zh user...@flink.apache.org Sent: Sunday, December 8, 2019 11:53 Subject: Re: Flink

Re: User program failures cause JobManager to be shutdown

2019-12-08 Thread Dongwon Kim
Hi Robert and Roman, Yeah, letting users know System.exit() is called would be much more appropriate than just intercepting and ignoring. Best, Dongwon On Sat, Dec 7, 2019 at 11:29 PM Robert Metzger wrote: > I guess we could manage the security only when calling the user's main() > method. > >
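For reference, a minimal sketch of how a SecurityManager could report a System.exit() call from user code instead of letting it kill the JVM; purely illustrative, not the actual change being discussed:

    // Install a SecurityManager around the call into the user's main() so that
    // System.exit() is turned into an exception rather than terminating the JobManager.
    val previous = System.getSecurityManager
    System.setSecurityManager(new SecurityManager {
      override def checkExit(status: Int): Unit =
        throw new SecurityException(s"User program called System.exit($status).")
      // Permissive for everything else in this sketch.
      override def checkPermission(perm: java.security.Permission): Unit = ()
    })
    try {
      // invokeUserMain()  // placeholder for invoking the user's main() method
    } finally {
      System.setSecurityManager(previous)
    }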

Re: KeyBy/Rebalance overhead?

2019-12-08 Thread Komal Mariam
Anyone? On Fri, 6 Dec 2019 at 19:07, Komal Mariam wrote: > Hello everyone, > > I want to get some insights into the KeyBy (and Rebalance) operations, as > according to my understanding they partition our tasks over the defined > parallelism and thus should make our pipeline faster. > > I am

[ANNOUNCE] Weekly Community Update 2019/49

2019-12-08 Thread Konstantin Knauf
Dear community, happy to share this week's community digest with an update on Flink 1.8.3, a revival of the n-ary stream operator, a proposal to move our build infrastructure to Azure pipelines, and quite a few other topics. Enjoy. Flink Development == * [releases] The feature

Re: Emit intermediate accumulator state of a session window

2019-12-08 Thread Rafi Aroch
Hi Chandu, Maybe you can use a custom trigger: .trigger(ContinuousEventTimeTrigger.of(Time.minutes(15))) This would continuously trigger your aggregate every period of time. Thanks, Rafi On Thu, Dec 5, 2019 at 1:09 PM Andrey Zagrebin wrote: > Hi Chandu, > > I am not sure whether
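A hedged sketch of wiring that trigger into a session window with an AggregateFunction (gap, interval, and the aggregate itself are made up; timestamps/watermarks are assumed to be assigned upstream):

    import org.apache.flink.api.common.functions.AggregateFunction
    import org.apache.flink.streaming.api.scala._
    import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows
    import org.apache.flink.streaming.api.windowing.time.Time
    import org.apache.flink.streaming.api.windowing.triggers.ContinuousEventTimeTrigger
    import org.apache.flink.streaming.api.windowing.windows.TimeWindow

    // Simple count aggregate, just to have an accumulator to emit.
    class CountAgg extends AggregateFunction[(String, Long), Long, Long] {
      override def createAccumulator(): Long = 0L
      override def add(in: (String, Long), acc: Long): Long = acc + in._2
      override def getResult(acc: Long): Long = acc
      override def merge(a: Long, b: Long): Long = a + b
    }

    val env = StreamExecutionEnvironment.getExecutionEnvironment
    val clicks: DataStream[(String, Long)] = env.fromElements(("u1", 1L), ("u1", 1L))

    val partialCounts = clicks
      .keyBy(_._1)
      .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
      // Fire every 15 minutes of event time, emitting the accumulated state so far.
      .trigger(ContinuousEventTimeTrigger.of[TimeWindow](Time.minutes(15)))
      .aggregate(new CountAgg)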

Change Flink binding address in local mode

2019-12-08 Thread Andrea Cardaci
Hi, Flink (or some of its services) listens on three random TCP ports during local[1] execution, e.g., 39951, 41009 and 42849. [1]: https://ci.apache.org/projects/flink/flink-docs-stable/dev/local_execution.html#local-environment The sockets listen on `0.0.0.0`, and since I need to run
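One thing that may help is constructing the local environment with an explicit Configuration; a hedged sketch using config keys from recent versions (whether the local mini cluster honors all of them can depend on the Flink version):

    import org.apache.flink.configuration.Configuration
    import org.apache.flink.streaming.api.environment.{StreamExecutionEnvironment => JavaEnv}

    val conf = new Configuration()
    conf.setString("rest.bind-address", "127.0.0.1") // bind the REST endpoint to loopback
    conf.setInteger("rest.port", 8081)               // fixed port instead of a random one
    conf.setString("taskmanager.host", "127.0.0.1")

    // Java API variant that accepts a Configuration for the local (mini) cluster.
    val env = JavaEnv.createLocalEnvironment(1, conf)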

unsubscribe

2019-12-08 Thread Deepak Sharma