Re: [ANNOUNCE] Progress of Apache Flink 1.10 #2

2019-11-01 Thread Thomas Weise
Is there any activity on FLIP-27 that would make it relevant for 1.10 release? Thanks Gary for the update, it provides excellent visibility on current activity and what we can expect with the release. On Fri, Nov 1, 2019 at 1:52 PM Steven Wu wrote: > Gary, FLIP-27 seems to get omitted in the

Re: Flink SQL + savepoint

2019-11-01 Thread Fanbin Bu
Kurt, What do you recommend for Flink SQL to use savepoints? On Thu, Oct 31, 2019 at 12:03 AM Yun Tang wrote: > Hi Fanbin > > > > If you do not change the parallelism or add and remove operators, you > could still use savepoint to resume your jobs with Flink SQL. > > > > However, as far as I

Re ordering events with flink

2019-11-01 Thread Vishwas Siravara
Hi guys, I want to know if it's possible to sort events in a flink data stream. I know I can't sort a stream but is there a way in which I can buffer for a very short time and sort those events before sending it to a data sink. In our scenario we consume from a kafka topic which has multiple
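
A minimal sketch of the "buffer briefly, then sort" idea asked about here, in plain Java with no Flink dependency. The names `TimedEvent`, `SortBuffer` and `maxDelayMs` are hypothetical and not taken from the thread. Events are held in a priority queue and released in timestamp order once they are older than the largest timestamp seen so far minus the allowed delay. In a Flink job the same logic would presumably live in a keyed process function driven by event-time timers, but that wiring is an assumption and is not shown.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.PriorityQueue;

/** Hypothetical event type: a payload plus an event-time timestamp in milliseconds. */
final class TimedEvent {
    final long timestamp;
    final String payload;

    TimedEvent(long timestamp, String payload) {
        this.timestamp = timestamp;
        this.payload = payload;
    }
}

/** Buffers events and releases them in timestamp order after a bounded delay. */
final class SortBuffer {
    private final long maxDelayMs;
    private final PriorityQueue<TimedEvent> buffer =
            new PriorityQueue<>(Comparator.comparingLong((TimedEvent e) -> e.timestamp));
    private long maxSeen = Long.MIN_VALUE;

    SortBuffer(long maxDelayMs) {
        this.maxDelayMs = maxDelayMs;
    }

    /**
     * Adds one event and returns all buffered events whose timestamp is at or
     * below (largest timestamp seen - maxDelayMs), in ascending timestamp order.
     * An event arriving later than that bound is released immediately, so it can
     * still come out behind newer events that were already emitted.
     */
    List<TimedEvent> add(TimedEvent event) {
        buffer.add(event);
        maxSeen = Math.max(maxSeen, event.timestamp);
        long releaseUpTo = maxSeen - maxDelayMs;
        List<TimedEvent> out = new ArrayList<>();
        while (!buffer.isEmpty() && buffer.peek().timestamp <= releaseUpTo) {
            out.add(buffer.poll());
        }
        return out;
    }

    /** Drains whatever is still buffered, in timestamp order (e.g. at end of input). */
    List<TimedEvent> flush() {
        List<TimedEvent> out = new ArrayList<>();
        while (!buffer.isEmpty()) {
            out.add(buffer.poll());
        }
        return out;
    }

    public static void main(String[] args) {
        SortBuffer sorter = new SortBuffer(4);
        for (long t : new long[] {5, 3, 9, 7, 20}) {
            for (TimedEvent e : sorter.add(new TimedEvent(t, "event-" + t))) {
                System.out.println(e.timestamp + " " + e.payload);
            }
        }
        for (TimedEvent e : sorter.flush()) {
            System.out.println(e.timestamp + " " + e.payload);
        }
    }
}
```

The trade-off is latency against ordering: a larger `maxDelayMs` tolerates more disorder but holds every event longer before it reaches the sink.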

Re: [ANNOUNCE] Progress of Apache Flink 1.10 #2

2019-11-01 Thread Steven Wu
Gary, FLIP-27 seems to get omitted in the 2nd update. below is the info from update #1. - FLIP-27: Refactor Source Interface [20] - FLIP accepted. Implementation is in progress. On Fri, Nov 1, 2019 at 7:01 AM Gary Yao wrote: > Hi community, > > Because we have approximately one month

Re: is Streaming Ledger open source?

2019-11-01 Thread kanth909
Got it! So will go with spark delta. Sent from my iPhone > On Nov 1, 2019, at 6:23 AM, Seth Wiesman wrote: > > Hi Kant, > > Streaming Ledger is actively maintained but is not open source. That repo > contains the sdk which is open source along with a single threaded runner for > testing.

Re: No FileSystem for scheme "file" for S3A in and state processor api in 1.9

2019-11-01 Thread spoganshev
No, I didn't, because it's inconvenient for us to have 2 different docker images for streaming and batch jobs.

Re: is Streaming Ledger open source?

2019-11-01 Thread kanth909
Got it! Sent from my iPhone > On Nov 1, 2019, at 6:23 AM, Seth Wiesman wrote: > > Hi Kant, > > Streaming Ledger is actively maintained but is not open source. That repo > contains the sdk which is open source along with a single threaded runner for > testing. The parallel execution engine

Re: Preserving (best effort) messages order between operators

2019-11-01 Thread Averell
Hi Yun, I found the cause of the issue. That ContinuousFileReaderOperator (my operator B) is using a PriorityQueue which maintains a buffer sorted by modTime, thus my records were re-ordered. I don't understand the reason behind using PriorityQueue instead of an ordinary Queue though. Thanks.

Re: low performance in running queries

2019-11-01 Thread Piotr Nowojski
Hi, More important would be the code profiling output. I think VisualVM allows sharing the code profiling result as “snapshots”? If you could analyse or share this, it would be helpful. From the attached screenshot the only thing that is visible is that there are no GC issues, and secondly

Re: low performance in running queries

2019-11-01 Thread Habib Mostafaei
Hi Piotrek, Thanks for the list of profilers. I used VisualVM and here is the resource usage for taskManager. Habib On 11/1/2019 9:48 AM, Piotr Nowojski wrote: Hi, >  Is there a simple way to get profiling information in Flink? Flink doesn’t provide any special tooling for that. Just use

[ANNOUNCE] Progress of Apache Flink 1.10 #2

2019-11-01 Thread Gary Yao
Hi community, Because we have approximately one month of development time left until the targeted Flink 1.10 feature freeze, we thought now would be a good time to give another progress update. Below we have included a list of the ongoing efforts that have made progress since our last release

Re: How can I make a Flink trigger emit only changed data?

2019-11-01 Thread Jun Zhang
??Evictor?? BestJun -- -- ??: Qi Kang

Re: Stateful functions presentation code (UI part)

2019-11-01 Thread Flavio Pompermaier
Thanks Igal for the detailed explanation. I know that this was only a demo, I just wanted to reason a bit on the pros and cons of sending data to a UI from Flink. Best, Flavio On Fri, Nov 1, 2019, 12:21 Igal Shilman wrote: > Hi Flavio, let me try to clarify: > > The intention of this

Re: Flink 1.5+ performance in a Java standalone environment

2019-11-01 Thread Jakub Danilewicz
Thanks for your reply, Till. As mentioned above I execute graph processing in a straight-ahead Java standalone environment (no cluster underneath, no specific configuration except for parallelism), just as if you simply ran the Java class I pasted upthread with a Flink distribution JAR (plus

Re: How to emit changed data only w/ Flink trigger?

2019-11-01 Thread Taher Koitawala
You can do this by writing a custom trigger or evictor. On Fri, Nov 1, 2019 at 3:08 PM Qi Kang wrote: > Hi all, > > > We have a Flink job which aggregates sales volume and GMV data of each > site on a daily basis. The code skeleton is shown as follows. > > > ``` > sourceStream > .map(message

Re: Async operator with a KeyedStream

2019-11-01 Thread vino yang
Hi Bastien, Your analysis of using KeyedStream in Async I/O is correct. It will not figure out the key. In your scenario, a good practice for interacting with the DB is async I/O + thread pool[1] + connection pool. You can use a connection pool to reuse and limit the MySQL connections. Best, Vino
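
A rough sketch of the "thread pool + connection pool" combination suggested above, in plain Java. Everything here is an assumption for illustration: `BoundedAsyncLookup` is a hypothetical class, the `Semaphore` only stands in for a real connection pool, and `query` is a placeholder for the actual database call. In a Flink async function, the returned future's result would presumably be handed to the `ResultFuture` passed to `asyncInvoke`, which is not shown.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.atomic.AtomicInteger;

/** Runs lookups on a thread pool while capping the number of concurrent "connections". */
final class BoundedAsyncLookup implements AutoCloseable {
    private final ExecutorService pool;
    private final Semaphore connections; // stand-in for a real connection pool
    private final AtomicInteger inUse = new AtomicInteger();
    private final AtomicInteger peak = new AtomicInteger();

    BoundedAsyncLookup(int threads, int maxConnections) {
        this.pool = Executors.newFixedThreadPool(threads);
        this.connections = new Semaphore(maxConnections);
    }

    /** Submits one lookup; the future completes on a pool thread. */
    CompletableFuture<String> lookup(String key) {
        return CompletableFuture.supplyAsync(() -> {
            connections.acquireUninterruptibly(); // wait for a free connection
            try {
                // Track the highest number of connections held at the same time.
                peak.accumulateAndGet(inUse.incrementAndGet(), Math::max);
                return query(key);
            } finally {
                inUse.decrementAndGet();
                connections.release();
            }
        }, pool);
    }

    /** Placeholder for the real database query (hypothetical). */
    private String query(String key) {
        return "value-for-" + key;
    }

    int peakConnections() {
        return peak.get();
    }

    @Override
    public void close() {
        pool.shutdown();
    }

    public static void main(String[] args) {
        try (BoundedAsyncLookup client = new BoundedAsyncLookup(8, 2)) {
            List<CompletableFuture<String>> pending = new ArrayList<>();
            for (int i = 0; i < 20; i++) {
                pending.add(client.lookup("key-" + i));
            }
            for (CompletableFuture<String> f : pending) {
                System.out.println(f.join());
            }
            System.out.println("peak connections: " + client.peakConnections());
        }
    }
}
```

The thread pool bounds how many blocking calls run at once, while the connection limit protects the database independently of how many threads the job happens to have.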

Re: How to emit changed data only w/ Flink trigger?

2019-11-01 Thread kant kodali
I am new to Flink so I am not sure if I am giving you the correct answer so you might want to wait for others to respond. But I think you should do .inUpsertMode() On Fri, Nov 1, 2019 at 2:38 AM Qi Kang wrote: > Hi all, > > > We have a Flink job which aggregates sales volume and GMV data of
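
Independently of which Flink mechanism ends up being used, the "emit only changed data" requirement in this thread can be sketched in plain Java as a filter that remembers the last value emitted per key and suppresses repeats. `ChangeFilter` and `onResult` are hypothetical names; in a Flink job the map would presumably be replaced by keyed state in a function placed after the window, which is an assumption and not something stated in the thread.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Objects;
import java.util.Optional;

/** Forwards a result for a key only when it differs from the last result seen for that key. */
final class ChangeFilter<K, V> {
    private final Map<K, V> lastSeen = new HashMap<>();

    /** Returns the value if it is new or changed for this key, otherwise an empty Optional. */
    Optional<V> onResult(K key, V value) {
        V previous = lastSeen.put(key, value);
        return Objects.equals(previous, value) ? Optional.empty() : Optional.of(value);
    }

    public static void main(String[] args) {
        ChangeFilter<String, Long> filter = new ChangeFilter<>();
        // Simulates a trigger firing repeatedly with the current aggregate per site.
        System.out.println(filter.onResult("site-a", 10L).isPresent());
        System.out.println(filter.onResult("site-a", 10L).isPresent());
        System.out.println(filter.onResult("site-a", 12L).isPresent());
    }
}
```

This keeps the periodic trigger as it is and drops unchanged results afterwards, at the cost of holding one value per key.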

Re: Stateful functions presentation code (UI part)

2019-11-01 Thread Igal Shilman
Hi Flavio, let me try to clarify: The intention of this example is to demonstrate how different entities (drivers, passengers, etc.) participate in a protocol (ride matching). For that we have the stateful functions application, and a standalone java application that just generates the events to

Re: Flink 1.5+ performance in a Java standalone environment

2019-11-01 Thread Till Rohrmann
Hi Jakub, what are the cluster settings and the exact job settings you are running your job with? I'm asking because one difference between legacy and FLIP-6 mode is that the legacy mode spreads out tasks across all available TaskManagers whereas the FLIP-6 mode tries to bin pack them on as

is Streaming Ledger open source?

2019-11-01 Thread kant kodali
Hi All, Is https://github.com/dataArtisans/da-streamingledger an open-source project? Looks to me that this project is not actively maintained. is that correct? since the last commit is one year ago and it shows there are 0 contributors? Thanks!

How to emit changed data only w/ Flink trigger?

2019-11-01 Thread Qi Kang
Hi all, We have a Flink job which aggregates sales volume and GMV data of each site on a daily basis. The code skeleton is shown as follows. ``` sourceStream .map(message -> JSON.parseObject(message, OrderDetail.class)) .keyBy("siteId")

Re: No FileSystem for scheme "file" for S3A in and state processor api in 1.9

2019-11-01 Thread Piotr Nowojski
Ok, thanks for the explanation, now it makes sense. Previously I hadn’t noticed that those snapshot state calls visible in your stack trace come from the State Processor API. We will try to reproduce it, so we might have more questions later, but that information might be enough. One more

Re: low performance in running queries

2019-11-01 Thread Piotr Nowojski
Hi, > Is there a simple way to get profiling information in Flink? Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non production clusters, no need to install anything if already using Oracle’s JVM), VisualVM

Re: low performance in running queries

2019-11-01 Thread Habib Mostafaei
I used the streaming WordCount provided by Flink and the file contains text like "This is some text...". I just copied it several times. Best, Habib On 11/1/2019 6:03 AM, Zhenghua Gao wrote: 2019-10-30 15:59:52,122 INFO org.apache.flink.runtime.taskmanager.Task - Split Reader:

How can I make a Flink trigger emit only changed data?

2019-11-01 Thread Qi Kang
Hi, we have a Flink job that aggregates sales volume and GMV data for each site by calendar day. The code skeleton is as follows: ``` sourceStream .map(message -> JSON.parseObject(message, OrderDetail.class)) .keyBy("siteId") .window(TumblingProcessingTimeWindows.of(Time.days(1), Time.hours(-8))) .trigger(ContinuousProcessingTimeTrigger.of(Time.seconds(1)))

Re: The RMClient's and YarnResourceManagers internal state about the number of pending container requests has diverged

2019-11-01 Thread Till Rohrmann
Hi Regina, at the moment the community works towards the 1.10 release with a lot of features trying to be completed. The intended feature freeze is end of November. Due to this it is quite hard to tell when exactly this problem will be properly fixed but we'll try our best. Cheers, Till On Thu,

Re: RemoteEnvironment cannot execute job from local.

2019-11-01 Thread Till Rohrmann
No it is the expected behaviour. As I've said, you should give the createRemoteEnvironment the user code jar of your program. Otherwise Flink cannot find your filter function. Hence, it works if you comment it out because it is not needed. Cheers, Till On Thu, Oct 31, 2019 at 11:41 AM Simon Su

Re: Flink 1.9 Sql Rowtime Error

2019-11-01 Thread OpenInx
Hi Polarisary. I checked the Flink codebase and your stack traces; it seems you need to format the timestamp as: "yyyy-MM-dd'T'HH:mm:ss.SSS'Z'" The code is here:
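
For reference, a small sketch of producing timestamps in that shape from plain Java, assuming the pattern meant here is the RFC 3339 style `yyyy-MM-dd'T'HH:mm:ss.SSS'Z'` in UTC. The class name `Rfc3339Timestamps` is hypothetical, and whether the producer side of this particular job should use exactly this helper is an assumption.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;

/** Formats epoch milliseconds as UTC timestamps with millisecond precision and a literal 'Z'. */
final class Rfc3339Timestamps {
    // The 'Z' is a literal in this pattern, so the formatter must be pinned to UTC.
    static final DateTimeFormatter FORMAT =
            DateTimeFormatter.ofPattern("yyyy-MM-dd'T'HH:mm:ss.SSS'Z'").withZone(ZoneOffset.UTC);

    static String format(long epochMillis) {
        return FORMAT.format(Instant.ofEpochMilli(epochMillis));
    }

    public static void main(String[] args) {
        System.out.println(format(0L));             // 1970-01-01T00:00:00.000Z
        System.out.println(format(1572566400123L)); // 2019-11-01T00:00:00.123Z
    }
}
```

Pinning the formatter to UTC matters because the trailing `Z` is written as a fixed character and would otherwise mislabel local times.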

Flink 1.9 Sql Rowtime Error

2019-11-01 Thread Polarisary
Hi All: I have defined a Kafka connector descriptor and registered a table: tEnv.connect(new Kafka() .version("universal") .topic(tableName) .startFromEarliest() .property("zookeeper.connect", "xxx") .property("bootstrap.servers", "xxx")

Re: [DISCUSS] Semantic and implementation of per-job mode

2019-11-01 Thread tison
Hi all, Thanks for your participation! First of all I have to clarify two points of confusion in this thread. 1. The proposed "pre-program" mode is definitely a new, opt-in mode. It is described in the "Compatibility" section of the original email. 2. The documentation linked in the original email "Flink