Flink Hive batch job throws FileNotFoundException

2021-05-27 Thread bowen li
Hi all, I am running a batch Table job that writes to Hive, and it fails with a FileNotFound error: the .staging file cannot be found. The version is 1.12.1 and the deployment mode is standalone. The error message is as follows: 11:28 Caused by: java.lang.Exception: Failed to finalize execution on master ... 33 more Caused by: org.apache.flink.table.api.TableException: Exception in finalizeGlobal at

Flink cluster crashes when submitting a job

2021-04-01 Thread bowen li
Hi all, here is the scenario we are facing: submitting a job fails with an error. We are on version 1.12.1, deployed in standalone mode. The error message follows. java.lang.OutOfMemoryError: Direct buffer memory. The direct out-of-memory error has occurred. This can mean two things: either job(s) require(s) a larger size of JVM direct memory or there is a direct memory

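Re: Source parallelism issue when reading Hive with Flink 1.10.0

2020-03-05 Thread Bowen Li
@JingsongLee, let's add the current Hive sink parallelism configuration strategy to the docs: https://issues.apache.org/jira/browse/FLINK-16448 On Tue, Mar 3, 2020 at 9:31 PM Jun Zhang <825875...@qq.com> wrote: > > Yes. Actually, I think the example SQL I wrote should be very widely used: after creating a Hive table and loading data into it, people usually run a similar SQL to check that the table was created correctly and the data is right. > > > > > > > On 2020-03-04 13:25, JingsongLee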

Re: [ANNOUNCE] Jingsong Lee becomes a Flink committer

2020-02-21 Thread Bowen Li
Congrats, Jingsong! On Fri, Feb 21, 2020 at 7:28 AM Till Rohrmann wrote: > Congratulations Jingsong! > > Cheers, > Till > > On Fri, Feb 21, 2020 at 4:03 PM Yun Gao wrote: > >> Congratulations Jingsong! >> >>Best, >>Yun >> >>

Re: Flink connect hive with hadoop HA

2020-02-10 Thread Bowen Li
Hi sunfulin, Sounds like you didn't configure Hadoop HA correctly on the client side according to [1]. Let us know if it helps resolve the issue. [1] https://stackoverflow.com/questions/25062788/namenode-ha-unknownhostexception-nameservice1 On Mon, Feb 10, 2020 at 7:11 AM Khachatryan Roman

Re: Read data from Oracle using Flink SQL API

2020-02-05 Thread Bowen Li
Hi Flavio, +1 for adding Oracle (and potentially more DBMSs like SqlServer, etc.) to flink-jdbc. Would you mind opening a parent ticket and some subtasks, one for each to-be-added DBMS you've thought of? On Sun, Feb 2, 2020 at 10:11 PM Jingsong Li wrote: > Yes, And I think we should add

Re: batch job OOM

2020-01-24 Thread Bowen Li
Hi Fanbin, You can install your own Flink build in AWS EMR, and it frees you from EMR's release cycles. On Thu, Jan 23, 2020 at 03:36 Jingsong Li wrote: > Fanbin, > > I have no idea now, can you create a JIRA to track it? You can describe > the complete SQL and some data information. > > Best, >

Re: [ANNOUNCE] Dian Fu becomes a Flink committer

2020-01-16 Thread Bowen Li
Congrats! On Thu, Jan 16, 2020 at 13:45 Peter Huang wrote: > Congratulations, Dian! > > > Best Regards > Peter Huang > > On Thu, Jan 16, 2020 at 11:04 AM Yun Tang wrote: > >> Congratulations, Dian! >> >> Best >> Yun Tang >> -- >> *From:* Benchao Li >> *Sent:*

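Re: Registering a Confluent Schema Registry Avro topic as a table in Flink SQL

2020-01-07 Thread Bowen Li
Hi 陈帅, this is a very reasonable requirement. We need to develop a Flink ConfluentSchemaRegistryCatalog to fetch the metadata. The user experience the community is aiming for is that users only provide the Confluent Schema Registry URL, and Flink SQL automatically obtains the information needed for reading and writing through ConfluentSchemaRegistryCatalog, so users no longer have to write DDL and formats by hand. Discussion has already started within the community and we expect to finish this in 1.11; please follow https://issues.apache.org/jira/browse/FLINK-12256 On Wed, Dec 18,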

Re: [DISCUSS] What parts of the Python API should we focus on next ?

2019-12-19 Thread Bowen Li
- integrate PyFlink with Jupyter notebook - Description: users should be able to run PyFlink seamlessly in Jupyter - Benefits: Jupyter is the industry-standard notebook for data scientists. I’ve talked to a few companies in North America, and they think Jupyter is the #1 way to empower

Re: [DISCUSS] have separate Flink distributions with built-in Hive dependencies

2019-12-13 Thread Bowen Li
cc user ML in case anyone wants to chime in On Fri, Dec 13, 2019 at 00:44 Bowen Li wrote: > Hi all, > > I want to propose to have a couple of separate Flink distributions with Hive > dependencies on specific Hive versions (2.3.4 and 1.2.1). The distributions > will be provided to

Re: [ANNOUNCE] Launch of flink-packages.org: A website to foster the Flink Ecosystem

2019-11-19 Thread Bowen Li
Great work, glad to see this finally happening! On Tue, Nov 19, 2019 at 6:26 AM Robert Metzger wrote: > Thanks. > > I added a ticket for this nice idea: > https://github.com/ververica/flink-ecosystem/issues/84 > > On Tue, Nov 19, 2019 at 11:29 AM orips wrote: > >> This is great. >> >> Can we

Re: [ANNOUNCE] Weekly Community Update 2019/45

2019-11-10 Thread Bowen Li
by *Sreekanth Krishnavajjala & Vinod Kataria (AWS)* > includes a hands-on introduction to Apache Flink on AWS EMR. [7] > * Upcoming Meetups > * At the next Athens Big Data Group on the 14th of November *Chaoran > Yu *of Lightbend will talk about Flink and Spark on Kubernete

Re: Streaming write to Hive

2019-09-05 Thread Bowen Li
we can > try contributing? > > +Yufei and Chang who are also interested in this. > > Thanks, > Qi > > On Thu, Sep 5, 2019 at 12:16 PM Bowen Li wrote: > >> Hi Qi, >> >> With 1.9 off the shelf, I'm afraid not. You can make HiveTableSink >> implement AppendStre

Re: Streaming write to Hive

2019-09-04 Thread Bowen Li
Hi Qi, With 1.9 off the shelf, I'm afraid not. You can make HiveTableSink implement AppendStreamTableSink (an empty interface for now) so it can be picked up in a streaming job. Also, streaming requires checkpointing, and the Hive sink doesn't do that yet. There might be other tweaks you need to make.

Re: kinesis table connector support

2019-09-02 Thread Bowen Li
@Fanbin, I don't think there's one yet. Feel free to create a ticket and submit a PR for it On Mon, Sep 2, 2019 at 8:13 AM Biao Liu wrote: > Hi Fanbin, > > I'm not familiar with table module. Maybe someone else could help. > > @jincheng sun > Do you know there is any plan for kinesis table

[ANNOUNCE] Kinesis connector becomes part of Flink releases

2019-08-30 Thread Bowen Li
Hi all, I'm glad to announce that, as #9494 was merged today, flink-connector-kinesis is now officially under the Apache 2.0 license in the master branch, and its artifact will be deployed to Maven Central as part of Flink releases starting from Flink 1.10.0. Users

Re: [ANNOUNCE] Andrey Zagrebin becomes a Flink committer

2019-08-14 Thread Bowen Li
Congratulations Andrey! On Wed, Aug 14, 2019 at 10:18 PM Rong Rong wrote: > Congratulations Andrey! > > On Wed, Aug 14, 2019 at 10:14 PM chaojianok wrote: > > > Congratulations Andrey! > > At 2019-08-14 21:26:37, "Till Rohrmann" wrote: > > >Hi everyone, > > > > > >I'm very happy to announce

[ANNOUNCE] Seattle Flink Meetup at Uber on 8/22

2019-08-12 Thread Bowen Li
Hi All ! Join our next Seattle Flink Meetup at Uber Seattle, featuring talks of [Flink + Kappa+ @ Uber] and [Flink + Pulsar for streaming-first, unified data processing]. - TALK #1: Moving from Lambda and Kappa Architectures to Kappa+ with Flink at Uber - TALK #2: When Apache Pulsar meets Apache

Re: Status of the Integration of Flink with Hive

2019-08-12 Thread Bowen Li
features seems to be in development. > Some really cool features have been described here: > https://fr.slideshare.net/BowenLi9/integrating-flink-with-hive-xuefu-zhang-and-bowen-li-seattle-flink-meetup-feb-2019 > My first need is to read and update Hive metadata. > Concerning the Hive data I can store th

Re: Cannot access the data from Hive-Tables in Blink

2019-07-17 Thread Bowen Li
Hi Yebgenya, This is caused by Hive version mismatch, you are either not using the right Hive version (double check your Hive version is supported by Blink), or not specifying the right version in yaml config (e.g. you use 2.3.4 but specify it as 1.2.1). Bowen On Tue, Jul 16, 2019 at 11:22 AM

Re: [ANNOUNCE] Rong Rong becomes a Flink committer

2019-07-11 Thread Bowen Li
Congrats, Rong! On Thu, Jul 11, 2019 at 10:48 AM Oytun Tez wrote: > Congratulations Rong! > > --- > Oytun Tez > > *M O T A W O R D* > The World's Fastest Human Translation Platform. > oy...@motaword.com — www.motaword.com > > > On Thu, Jul 11, 2019 at 1:44 PM Peter Huang > wrote: > >>

Re: Hive in sql-client

2019-07-08 Thread Bowen Li
Hi Yebgenya, To use Blink's integration with Hive in SQL CLI, you can refer to Blink's documentation at [1], [2], and [3]. Note that Hive integration is actually available in the **Flink master branch** now and will be released soon as part of Flink 1.9.0. The end-to-end integration should be

Re:

2019-07-08 Thread Bowen Li
Hi Xuchen, Every email in our ML asking questions **MUST** have a valid subject, to facilitate archive search in the future and save people's time to decide whether they can help answer your question or not by just glimpsing the subject thru their email clients. Though your question itself is

Re: Source Kafka and Sink Hive managed tables via Flink Job

2019-07-04 Thread Bowen Li
your questions in our mail in yellow. > > Thank you > > Kind regards > > -Original Message- > > From: Bowen Li [mailto:bowenl...@gmail.com] > > Sent: Wednesday, July 03, 2019 9:34 PM > > To: dev; youssef.achb...@euranova.eu > > Subject: Re: Source Ka

Re: Source Kafka and Sink Hive managed tables via Flink Job

2019-07-03 Thread Bowen Li
BTW, I'm adding user@ mailing list since this is a user question and should be asked there. dev@ mailing list is only for discussions of Flink development. Please see https://flink.apache.org/community.html#mailing-lists On Wed, Jul 3, 2019 at 12:34 PM Bowen Li wrote: > Hi Youssef, >

Re: [External] Flink 1.7.1 on EMR metrics

2019-06-01 Thread Bowen Li
To answer your question on your debugging code, your reporter has a bug: log.info("STATSD SENDING: ", name, value); should be -> log.info("STATSD SENDING: {} {}", name, value); - On Sat, Jun 1, 2019 at 7:30 PM Padarn Wilson wrote: > Thanks both: Using the inbuilt Slf4j reporter is a
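
The fix quoted above relies on SLF4J-style "{}" placeholders: arguments are only rendered where the message contains a placeholder. A minimal sketch of that behavior follows, using a hypothetical stand-in formatter (`PlaceholderDemo.format` is an illustration only, not SLF4J's implementation, and the metric name is made up).

```java
// Minimal stand-in for SLF4J-style "{}" substitution (hypothetical helper,
// not SLF4J's actual implementation): each "{}" consumes one argument, and
// arguments without a matching "{}" are silently ignored.
final class PlaceholderDemo {
    static String format(String pattern, Object... args) {
        StringBuilder out = new StringBuilder();
        int from = 0;
        int argIndex = 0;
        while (argIndex < args.length) {
            int at = pattern.indexOf("{}", from);
            if (at < 0) {
                break; // no placeholder left: remaining args are dropped
            }
            out.append(pattern, from, at).append(args[argIndex++]);
            from = at + 2;
        }
        return out.append(pattern.substring(from)).toString();
    }

    public static void main(String[] args) {
        // Buggy call shape: no placeholders, so name and value never appear.
        System.out.println(format("STATSD SENDING: ", "numRecordsIn", 42));
        // Fixed call shape: one "{}" per argument.
        System.out.println(format("STATSD SENDING: {} {}", "numRecordsIn", 42));
    }
}
```

With the buggy pattern the metric name and value are dropped from the output, which is why the debugging log looked empty.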

[ANNOUNCE] Seattle Flink Meetup at AWS on May 30

2019-05-20 Thread Bowen Li
Hi Greater Seattle folks! We are hosting our next meetup with AWS Kinesis Analytics team on May 30 next Thursday in downtown Seattle. We feature two talks this time: 1. *"AWS Kinesis Analytics: running Flink serverless in multi-tenant environment"* by Kinesis Analytics team on: -

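Re: Flink and Hive integration question

2019-05-14 Thread Bowen Li
Hi, we are working on platform-level integration of Flink and Hive for both metadata and data. You can follow: the flink-connector-hive module, FLINK-11479 for Hive metadata, and FLINK-10729 for Hive data.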

Re: [DISCUSS] Drop Elasticsearch 1 connector

2019-04-05 Thread Bowen Li
+1 for dropping elasticsearch 1 connector. On Wed, Apr 3, 2019 at 5:10 AM Chesnay Schepler wrote: > Hello everyone, > > I'm proposing to remove the connector for elasticsearch 1. > > The connector is used significantly less than more recent versions (2&5 > are downloaded 4-5x more), and hasn't

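Re: Re: [PROGRESS UPDATE] [DISCUSS] Flink's Hive Compatibility and Catalogs

2019-03-29 Thread Bowen Li
try Flink > > on Hive > > > - > > > *What is your motivation for using Flink-Hive? Maintaining only one data processing system? Getting better performance with Flink?* // Technology iteration; ideally, of course, batch and streaming are unified and only one data processing system is maintained. Spark's performance is already great, so better performance is not something we need. > > > - *How do you use Hive? How large is the data? Mostly reads, or both reads and writes?* // Large tables, a sizable amount of data, mostly reads > > > - *How many Hive UDFs do you have? Of what types?* // Quite a lot >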

Re: [PROGRESS UPDATE] [DISCUSS] Flink-Hive Integration and Catalogs

2019-03-20 Thread Bowen Li
feedback on Flink-Hive integration." * > > Regards, > Shaoxuan > > On Wed, Mar 20, 2019 at 7:16 AM Bowen Li wrote: > >> Hi Flink users and devs, >> >> We want to get your feedback on integrating Flink with Hive. >> >> Background: In Flink Forwar

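[PROGRESS UPDATE] [DISCUSS] Flink's Hive Compatibility and Catalogs

2019-03-19 Thread Bowen Li
Hello everyone on the Flink Chinese channel, *we would like to collect your requirements and opinions on Flink's compatibility with Hive*. Background: at Flink Forward China last December, the community announced that it would push for Flink-Hive compatibility. On Feb 21 this year, at the Seattle Flink Meetup, we gave the talk "Integrating Flink with Hive" with a live demo, which was very well received. Now, in mid-March, we have internally finished building Flink's brand-new catalog architecture, compatibility with Hive metadata, and the common reading and writing through Flink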

[PROGRESS UPDATE] [DISCUSS] Flink-Hive Integration and Catalogs

2019-03-19 Thread Bowen Li
nts/258723322/>, We presented Integrating Flink with Hive <https://www.slideshare.net/BowenLi9/integrating-flink-with-hive-xuefu-zhang-and-bowen-li-seattle-flink-meetup-feb-2019> with a live demo to local community and got great response. As of mid March now, we have internally finished buil

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-08 Thread Bowen Li
Confluent Hub for Kafka is another good example of this kind. I personally like it over the Spark site. It may be worth checking it out with the Kafka folks. On Thu, Mar 7, 2019 at 6:06 AM Becket Qin wrote: > Absolutely! Thanks for the pointer. I'll submit a PR to update

Re: [DISCUSS] Create a Flink ecosystem website

2019-03-05 Thread Bowen Li
Thanks for bringing it up, Becket. That sounds very good to me. Spark also has such a page for ecosystem projects https://spark.apache.org/third-party-projects.html and a hosted website https://spark-packages.org/ with metadata, categories/tags and stats mentioned in the doc. Bowen On Tue, Mar 5,

Re: TimeZone shift problem in Flink SQL

2019-01-24 Thread Bowen Li
Hi, Did you consider timezone in conversion in your UDF? On Tue, Jan 22, 2019 at 5:29 AM 徐涛 wrote: > Hi Experts, > I have the following two UDFs, > unix_timestamp: transform from string to Timestamp, with the > arguments (value:String, format:String), return Timestamp >

Re: [DISCUSS] Towards a leaner flink-dist

2019-01-24 Thread Bowen Li
+1 for leaner distribution and a better 'download' webpage. +1 for a full distribution if we can automate it besides supporting the leaner one. If we support both, I'd imagine release managers should be able to package two distributions with a single change of parameter instead of manually package

Re: [ANNOUNCE] Apache Flink 1.5.2 released

2018-07-31 Thread Bowen Li
Congratulations, community! On Tue, Jul 31, 2018 at 1:44 AM Chesnay Schepler wrote: > The Apache Flink community is very happy to announce the release of Apache > Flink 1.5.2, which is the second bugfix release for the Apache Flink 1.5 > series. > > Apache Flink® is an open-source stream

Re: Is KeyedProcessFunction available in Flink 1.4?

2018-07-19 Thread Bowen Li
Hi Anna, KeyedProcessFunction is only available starting from Flink 1.5. The doc is here. It extends ProcessFunction and shares the same functionality, except giving more

Re: Is Flink using even-odd versioning system

2018-07-10 Thread Bowen Li
Hi Alexander, AFAIK, Flink releases don't do that. The community has done its best to ensure every release is at its best state. Thanks, Bowen On Tue, Jul 10, 2018 at 4:54 AM Alexander Smirnov < alexander.smirn...@gmail.com> wrote: > to denote development and stable releases? >

Re: Flink and AWS S3 integration: java.lang.NullPointerException: null uri host

2018-05-30 Thread Bowen Li
Did you run Flink on AWS EMR or somewhere else? Have you read and followed instructions on https://ci.apache.org/projects/flink/flink-docs-release-1.5/ops/deployment/aws.html#amazon-web-services-aws ? On Wed, May 30, 2018 at 7:08 AM, Fabian Wollert wrote: > Hi, I'm trying to set up

Re: [ANNOUNCE] Apache Flink 1.5.0 release

2018-05-28 Thread Bowen Li
Congratulations, everyone! On Mon, May 28, 2018 at 1:15 AM, Fabian Hueske wrote: > Thank you Till for serving as a release manager for Flink 1.5! > > 2018-05-25 19:46 GMT+02:00 Till Rohrmann : > > > Quick update: I had to update the date of the release blog post which > also > > changed the

Re: Clarification in TumblingProcessing TimeWindow Documentation

2018-05-28 Thread Bowen Li
Hi Dhruv, I can see it's confusing, and it does seem the comment should be improved. You can find a concrete explanation of tumbling windows and the related arguments at https://ci.apache.org/projects/flink/flink-docs-master/dev/stream/operators/windows.html#tumbling-windows Feel free to open a PR

Re: Missing MapState when Timer fires after restored state

2018-05-14 Thread Bowen Li
Hi Juho, You are right, there's no transactional guarantee on timers and state in processElement(). They may end up with inconsistency if your job was cancelled in the middle of processing an element. To avoid the situation, the best programming practice is to always check if the state you're
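
The advice above is cut off in this excerpt. Assuming it continues along the lines of "check whether the state you are about to read actually exists before using it", a plain-Java sketch of that defensive check might look like the following. No Flink dependency is used; the `Map` stands in for keyed state, and all class and method names are hypothetical.

```java
import java.util.HashMap;
import java.util.Map;

// Plain-Java sketch (no Flink dependency): a Map stands in for keyed state.
// The names here are hypothetical and only illustrate the defensive check.
final class DefensiveTimerDemo {
    private final Map<String, Long> countState = new HashMap<>();

    // Updates the per-key state, as processElement() would.
    void processElement(String key) {
        countState.merge(key, 1L, Long::sum);
    }

    // Simulates state that is missing by the time the timer fires.
    void clearState(String key) {
        countState.remove(key);
    }

    // Called when a timer for `key` fires; returns what would be emitted.
    String onTimer(String key) {
        Long count = countState.get(key);
        if (count == null) {
            // State and timer are out of sync: skip instead of throwing.
            return "skip " + key;
        }
        return key + "=" + count;
    }
}
```

The point of the sketch is only that the timer callback tolerates absent state rather than assuming it was written together with the timer.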

Re: Recommended books

2018-05-09 Thread Bowen Li
I'd recommend this book, *Stream Processing with Apache Flink: Fundamentals, Implementation, and Operation of Streaming Applications.* It's probably the most authentic book about Flink on the market. You can buy and read the early release on O'Reilly,

Re: 1.4.3 release/roadmap

2018-04-19 Thread Bowen Li
To find bug fixes that are going into 1.4.x, say 1.4.3, you can filter JIRA tickets with 'Fix Versions' as '1.4.3'. On Thu, Apr 19, 2018 at 1:36 AM, Daniel Harper wrote: > Hi there, > > There are some bug fixes that are in the 1.4 branch that we would like to > be made

Re: [ANNOUNCE] Apache Flink 1.4.1 released

2018-02-15 Thread Bowen Li
Congratulations everyone! On Thu, Feb 15, 2018 at 10:04 AM, Tzu-Li (Gordon) Tai wrote: > The Apache Flink community is very happy to announce the release of Apache > Flink 1.4.1, which is the first bugfix release for the Apache Flink 1.4 > series. > > > Apache Flink® is an

[SEATTLE MEETUP] Announcing First Seattle Apache Flink Meetup

2018-01-04 Thread Bowen Li
drinks will be provided. Plenty of onsite parking spots. DATE: Jan 17th, 2018, Wednesday TALKS: - Haitao Wang, Senior Staff Engineer at Alibaba, will give a presentation on large-scale streaming processing with Flink and Flink SQL at Alibaba and several internal use cases. - Bowen Li will talk

Re: Could not flush and close the file system output stream to s3a, is this fixed?

2017-12-14 Thread Bowen Li
14, 2017 at 2:05 AM, Fabian Hueske <fhue...@gmail.com> wrote: > Bowen Li (in CC) closed the issue but there is no fix (or at least it is > not linked in the JIRA). > Maybe it was resolved in another issue or can be differently resolved. > > @Bowen, can you comment on ho

Re: [DISCUSS] Dropping Scala 2.10

2017-09-20 Thread Bowen Li
+1 for dropping support for Scala 2.10 On Tue, Sep 19, 2017 at 3:29 AM, Sean Owen wrote: > For the curious, here's the overall task in Spark: > > https://issues.apache.org/jira/browse/SPARK-14220 > > and most of the code-related changes: > >

Re: Does RocksDB need a dedicated CPU?

2017-09-05 Thread Bowen Li
> SSDs to achieve the best performance. > > Regards, > > Kien > > > > On 9/5/2017 1:15 PM, Bowen Li wrote: > >> Hi guys, >> >> Does RocksDB need a dedicated CPU? Do we need to allocate one CPU for >> each RocksDB while deploying Flink cluster wi

Does RocksDB need a dedicated CPU?

2017-09-05 Thread Bowen Li
Hi guys, Does RocksDB need a dedicated CPU? Do we need to allocate one CPU for each RocksDB while deploying Flink cluster with RocksDB state backend? I think there's probably no need since RocksDB is a native 'library', but I want to confirm it with Flink community. Thanks, Bowen

Re: Even out the number of generated windows

2017-08-28 Thread Bowen Li
> ProcessFunction docs: https://ci.apache.org/projects/flink/flink-docs-release-1.3/dev/stream/process_function.html > > Best, > Aljoscha > > On 27. Aug 2017, at 19:19, Bowen Li <bowen...@offerupnow.com> wrote: > > Hi Robert, > Thank you for the suggestion, I

Re: Even out the number of generated windows

2017-08-27 Thread Bowen Li
To throttle a stream, I would recommend just doing a map operation that is > calling "Thread.sleep()" every n records. > > On Sat, Aug 26, 2017 at 4:11 AM, Bowen Li <bowen...@offerupnow.com> wrote: > >> Hi Robert, >> We use kinesis sink (FlinkKinesis
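
The suggestion quoted above (pause every n records inside a map step) can be sketched in plain Java. This is an illustration under assumptions: `ThrottlingMapper`, `everyN`, and `pauseMillis` are hypothetical names, and in an actual Flink job the same logic would sit inside a map function rather than a `java.util.function.Function`.

```java
import java.util.function.Function;

// Plain-Java sketch of the quoted suggestion; in a Flink job the same logic
// would sit inside a map function. Class and parameter names are hypothetical.
final class ThrottlingMapper<T> implements Function<T, T> {
    private final int everyN;
    private final long pauseMillis;
    private long seen = 0;
    private long pauses = 0;

    ThrottlingMapper(int everyN, long pauseMillis) {
        if (everyN <= 0) {
            throw new IllegalArgumentException("everyN must be positive");
        }
        this.everyN = everyN;
        this.pauseMillis = pauseMillis;
    }

    @Override
    public T apply(T record) {
        seen++;
        if (seen % everyN == 0) {
            pauses++;
            try {
                Thread.sleep(pauseMillis); // slow the stream down
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt(); // preserve interrupt status
            }
        }
        return record; // identity map: records pass through unchanged
    }

    // Number of times the mapper has paused so far.
    long pauses() {
        return pauses;
    }
}
```

Sleeping in the mapper trades latency for a lower output rate, which matches the latency caveat raised elsewhere in this thread.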

Re: Even out the number of generated windows

2017-08-25 Thread Bowen Li
e introduce > additional latency (= results come in later). > > > On Fri, Aug 25, 2017 at 6:23 AM, Bowen Li <bowen...@offerupnow.com> wrote: > >> Hi guys, >> >> I do have a question for how Flink generates windows. >> >> We are using a 1-day sized

Re: Which window function to use to start a window at anytime

2017-08-25 Thread Bowen Li
you would need a way to clean that up. You could do > this by using a ProcessFunction where you set a cleanup timer for the > per-key window-start state. > > Best, > Aljoscha > > On 16. Aug 2017, at 06:37, Bowen Li <bowen...@offerupnow.com> wrote: > > Hi guys, >

Re: Flink doesn't free YARN slots after restarting

2017-08-25 Thread Bowen Li
emantics. > > What do you mean by burning down the underlying KPL? If KPL has a max > throughput, then the FlinkKinesisProducer should ideally respect that. > > nice ASCII art btw :-) > > Cheers, > Till > > On Fri, Aug 25, 2017 at 6:20 AM, Bowen Li <bowen...@offerupnow.

Even out the number of generated windows

2017-08-24 Thread Bowen Li
Hi guys, I do have a question about how Flink generates windows. We are using a 1-day sized sliding window with a 1-hour slide to count some features of items based on event time. We have about 20 million items. We observed that Flink only emits results at a fixed time each hour (e.g. 1am, 2am, 3am,
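
The observation above follows from how sliding windows are aligned. As a sketch, assuming epoch-aligned windows with zero offset and non-negative timestamps (which is how the sliding event-time assigner is understood to behave; the helper below is a hypothetical re-implementation, not Flink code), every event falls into size/slide = 24 windows, and every one of those windows ends exactly on an hour boundary.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical re-implementation of epoch-aligned sliding window assignment
// (zero offset, non-negative timestamps); not taken from Flink's source.
final class SlidingWindowDemo {
    static final long HOUR = 60L * 60L * 1000L;
    static final long DAY = 24L * HOUR;

    // Returns the start timestamps of all windows that contain `timestamp`.
    static List<Long> windowStarts(long timestamp, long size, long slide) {
        List<Long> starts = new ArrayList<>();
        // Latest window start at or before the timestamp, aligned to the slide.
        long lastStart = timestamp - (timestamp % slide);
        for (long start = lastStart; start > timestamp - size; start -= slide) {
            starts.add(start);
        }
        return starts;
    }

    public static void main(String[] args) {
        long ts = 10 * DAY + HOUR + HOUR / 2; // 01:30 on day 10
        List<Long> starts = windowStarts(ts, DAY, HOUR);
        System.out.println("windows per event: " + starts.size());
    }
}
```

Because all window ends coincide with hour boundaries for every key, all keys become eligible to fire at the same moment once the watermark passes the hour, which would produce the hourly bursts described.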

Re: Flink doesn't free YARN slots after restarting

2017-08-24 Thread Bowen Li
de us with the debug log level logs of the > TaskManagers. > > Cheers, > Till > > > On Fri, Aug 11, 2017 at 5:37 AM, Bowen Li <bowen...@offerupnow.com> wrote: > >> Hi Till, >> Any idea why it happened? I've tried different configurations for >>

Re: akka timeout

2017-08-23 Thread Bowen Li
Hi Steven, Yes, GC is a big overhead; it may cause your CPU utilization to reach 100% and every process to stop working. We ran into this a while ago too. How much memory did you assign to the TaskManager? What is your CPU utilization when your TaskManager is considered 'killed'? Bowen

Re: [Survey] How many people use Flink with AWS Kinesis sink

2017-08-21 Thread Bowen Li
> Do your observations pertain to Kinesis Consumer as well, or mainly to the > Kinesis Producer? > > Best, > Stephan > > > On Mon, Aug 21, 2017 at 8:29 AM, Bowen Li <bowen...@offerupnow.com> wrote: > >> Hi guys, >> We want to have a more accurate

[Survey] How many people use Flink with AWS Kinesis sink

2017-08-21 Thread Bowen Li
Hi guys, We want to have a more accurate idea of how many people are writing Flink's computation result to AWS Kinesis, and how many people had successful Flink deployment against Kinesis? The reason I ask for the survey is because we have been trying to make our Flink jobs and Kinesis

Which window function to use to start a window at anytime

2017-08-15 Thread Bowen Li
Hi guys, We are trying to use Flink to count millions of keyed items over an hour-long window, hourly, as `time(SlidingEventTimeWindows.of(1hour, 1hour))`. According to the sliding window doc, all windows are

Re: Flink doesn't free YARN slots after restarting

2017-08-10 Thread Bowen Li
On Wed, Aug 9, 2017 at 1:33 PM, Bowen Li <bowen...@offerupnow.com> wrote: > Hi Till, > Thanks for taking this issue. > > We are not comfortable sending logs to an email list that is this > open. I'll send logs to you. > > Thanks, > Bowen > > > On Wed

Re: Flink doesn't free YARN slots after restarting

2017-08-09 Thread Bowen Li
ogs with us? > > Cheers, > Till > > On Wed, Aug 9, 2017 at 9:32 AM, Bowen Li <bowen...@offerupnow.com> wrote: > >> Hi guys, >> I was running a Flink job (12 parallelism) on an EMR cluster with 48 >> YARN slots. When the job starts, I can see from Flink U

Flink doesn't free YARN slots after restarting

2017-08-09 Thread Bowen Li
Hi guys, I was running a Flink job (12 parallelism) on an EMR cluster with 48 YARN slots. When the job starts, I can see from Flink UI that the job took 12 slots, and 36 slots were left available. I would expect that when the job fails, it would restart from checkpointing by taking

Re: [POLL] Who still uses Java 7 with Flink ?

2017-07-12 Thread Bowen Li
5274018=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-15274018 > >> > > > > > >> > > > > > >> > > > > > >> > > > > On Thu, Mar 23, 2017 at 2:42 PM, Theodore Vasiloudis < >

Re: confusing RocksDBStateBackend parameters

2017-06-18 Thread Bowen Li
Hope that helps. > > Best > Ziyad > > Best Regards > *Ziyad Muhammed Mohiyudheen * > 407, Internationales Studienzentrum Berlin > Theodor-Heuss-Platz 5 > 14052 Berlin > *Ph: +49 176 6587 3343 <%2B49%20176%206587%203343>* > *Mail to*: *mmzi...@gmail.com <mmzi

confusing RocksDBStateBackend parameters

2017-06-16 Thread Bowen Li
Hello guys, I've been trying to figure out differences among several parameters of RocksDBStateBackend. The confusing parameters are: In flink-conf.yaml: 1. state.backend.fs.checkpointdir 2. state.backend.rocksdb.checkpointdir 3. state.checkpoints.dir and

Re: Clarification on state backend parameters

2017-06-14 Thread Bowen Li
FYI, http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Clarification-on-state-backend-parameters-td11419.html here's the context that discussed differences among: state.backend.fs.checkpointdir state.backend.rocksdb.checkpointdir state.checkpoints.dir On Wed, Jun 14, 2017 at

Re: [POLL] Who still uses Java 7 with Flink ?

2017-03-16 Thread Bowen Li
There's always a tradeoff we need to make. I'm in favor of upgrading to Java 8 to bring in all new Java features. The common way I've seen (and I agree) other software upgrading major things like this is 1) upgrade for next big release without backward compatibility and notify everyone 2)