Re: Does Flink support Hadoop (HDFS) 2.9 ?

2018-03-01 Thread Piotr Nowojski
Hi, You can build Flink against Hadoop 2.9: https://issues.apache.org/jira/browse/FLINK-8177 It seems like convenience binaries will be built by us only since 1.5: https://issues.apache.org/jira/browse/FLINK-8363

Re: Heap Problem with Checkpoints

2018-06-20 Thread Piotr Nowojski
ian Wollert > Zalando SE > > E-Mail: fabian.woll...@zalando.de > <mailto:fabian.woll...@zalando.de> > > On Tue, 19 June 2018 at 11:55, Piotr Nowojski > mailto:pi...@data-artisans.com>> wrote: > Hi, > > Can you search the logs/std err/std output for

Re: Always trigger calculation of a tumble window in Flink SQL

2018-11-06 Thread Piotr Nowojski
Hi, You might come up with some magical self join that could do the trick - join/window join the aggregation result with itself and then aggregate it again. I don’t know if that’s possible (you would probably need to write a custom aggregate function), and it would be inefficient. It will be

Re: Understanding checkpoint behavior

2018-11-06 Thread Piotr Nowojski
Hi, Checkpoint duration sync, that’s only the time taken for the “synchronous” part of taking a snapshot of your operator. Your 11m time probably comes from the fact that before this snapshot, checkpoint barrier was stuck somewhere in your pipeline for that amount of time processing some

Re: Always trigger calculation of a tumble window in Flink SQL

2018-11-08 Thread Piotr Nowojski
ster a timer when no elements received during a time > window? > My requirement is to always fire at end of the time window even no result > from the sql query. > On 7 Nov 2018, at 09:48, Piotr Nowojski wrote: > > Hi, > > You would have to register timers (probably based o

Re: Understanding checkpoint behavior

2018-11-08 Thread Piotr Nowojski
Hi, > On 6 Nov 2018, at 18:22, PranjalChauhan wrote: > > Thank you for your response Piotr. I plan to upgrade to Flink 1.5.x early > next year. > > Two follow-up questions for now. > > 1. > " When operator snapshots are taken, there are two parts: the synchronous > and the asynchronous

Re: Always trigger calculation of a tumble window in Flink SQL

2018-11-07 Thread Piotr Nowojski
Hi, You would have to register timers (probably based on event time). Your operator would be a vastly simplified window operator, where for given window you keep emitted record from your SQL, sth like: MapState emittedRecords; // map window start -> emitted record When you process elements,
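
The bookkeeping described in this message can be sketched in plain Java without any Flink dependency. This is only one possible reading of the truncated snippet: the class and method names, the fixed window size, and the "EMPTY@" placeholder are assumptions made for illustration, and an ordinary HashMap stands in for Flink's MapState. Timestamps are assumed to be non-negative.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for a simplified window operator; not Flink API.
class AlwaysFireSketch {
    static final long WINDOW_SIZE_MS = 60_000L; // assumed window size

    // map window start -> emitted record (a HashMap replaces MapState here)
    private final Map<Long, String> emittedRecords = new HashMap<>();

    // Start of the tumbling window that contains the given non-negative timestamp.
    public static long windowStart(long timestamp) {
        return timestamp - (timestamp % WINDOW_SIZE_MS);
    }

    // Called for every record produced by the SQL query.
    public void processElement(long timestamp, String record) {
        emittedRecords.put(windowStart(timestamp), record);
    }

    // Called when the timer for a window fires: forward the stored record,
    // or a placeholder when the query produced nothing for that window.
    public String onTimer(long windowStart) {
        String record = emittedRecords.remove(windowStart);
        return record != null ? record : "EMPTY@" + windowStart;
    }

    public static void main(String[] args) {
        AlwaysFireSketch op = new AlwaysFireSketch();
        op.processElement(61_000L, "a");
        System.out.println(op.onTimer(60_000L));  // window that received a record
        System.out.println(op.onTimer(120_000L)); // window that received nothing
    }
}
```

In a real job the timer would presumably be registered per window through Flink's timer service; here the caller invokes onTimer by hand so the logic stays testable in isolation.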

Re: Live configuration change

2018-11-06 Thread Piotr Nowojski
Hi, Sorry but none that I’m aware of. As far as I know, the only way to dynamically configure Kafka source would be for you to copy and modify its code. Piotrek > On 6 Nov 2018, at 15:19, Ning Shi wrote: > > In the job I'm implementing, there are a couple of configuration > variables that I

Re: JobManager did not respond within 60000 ms

2018-10-09 Thread Piotr Nowojski
Hi, You have quite a complicated job graph and very low memory settings for the job manager and task manager. It might be that long GC pauses are causing this problem. Secondly, there are quite a few results in Google search

Re: Watermark through Rest Api

2018-10-09 Thread Piotr Nowojski
Hi, Watermarks are tracked per Task/Operator level: https://ci.apache.org/projects/flink/flink-docs-stable/monitoring/metrics.html#io Tracking watermarks on the job level would be problematic, since it would

Re: Wondering how to check if taskmanager is registered with the jobmanager, without asking job manager

2018-10-10 Thread Piotr Nowojski
Hi, I don’t think that’s exposed on the TaskManager. Maybe it would simplify things a bit if you implement this as a single “JobManager” health check, not multiple TaskManagers health check - for example verify that there are expected number of registered TaskManagers. It might cover your

Re: Flink leaves a lot RocksDB sst files in tmp directory

2018-10-10 Thread Piotr Nowojski
Hi, Was this happening in older Flink version? Could you post in what circumstances the job has been moved to a new TM (full job manager logs and task manager logs would be helpful)? I’m suspecting that those leftover files might have something to do with local recovery. Piotrek > On 9 Oct

Re: JobManager did not respond within 60000 ms

2018-10-10 Thread Piotr Nowojski
; in one). > > Thanks again Piotrek! > > Julien. > > - Original message - > From: "Piotr Nowojski" > To: jpreis...@free.fr > Cc: user@flink.apache.org > Sent: Tuesday, 9 October 2018 10:37:58 > Subject: Re: JobManager did not respond within 60000 ms >

Re: Wondering how to check if taskmanager is registered with the jobmanager, without asking job manager

2018-10-10 Thread Piotr Nowojski
y task managers > should be there. Bit more logic, but doable. Thnx for the tip. > > Cheers, > Barisa > > On Wed, 10 Oct 2018 at 09:05, Piotr Nowojski <mailto:pi...@data-artisans.com>> wrote: > Hi, > > I don’t think that’s exposed on the TaskManager. >

Re: How does flink read data from kafka number of TM's are more than topic partitions

2018-09-21 Thread Piotr Nowojski
Hi, Yes, in your case half of the Kafka source tasks wouldn’t read/process any records (you can check that in web UI). This shouldn’t harm you, unless your records will be redistributed after the source. For example: source.keyBy(..).process(new MyVeryHeavyOperator()).print() Should be fine,
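
Why half of the source tasks stay idle can be illustrated with a small plain-Java sketch. The round-robin rule below (partition p goes to subtask p % parallelism) is an assumption chosen for simplicity and is not necessarily the exact assignment Flink's Kafka connector uses; the class and method names are hypothetical.

```java
import java.util.ArrayList;
import java.util.List;

// Illustration only: a simplified partition-to-subtask assignment.
class PartitionAssignmentSketch {

    // Returns, for each source subtask, the list of partitions it owns.
    public static List<List<Integer>> assign(int partitions, int parallelism) {
        List<List<Integer>> bySubtask = new ArrayList<>();
        for (int i = 0; i < parallelism; i++) {
            bySubtask.add(new ArrayList<>());
        }
        for (int p = 0; p < partitions; p++) {
            bySubtask.get(p % parallelism).add(p); // assumed round-robin rule
        }
        return bySubtask;
    }

    // Counts source subtasks that own no partition and therefore read nothing.
    public static int idleSubtasks(int partitions, int parallelism) {
        int idle = 0;
        for (List<Integer> owned : assign(partitions, parallelism)) {
            if (owned.isEmpty()) {
                idle++;
            }
        }
        return idle;
    }

    public static void main(String[] args) {
        // 4 partitions read by 8 source subtasks: half of the subtasks are idle.
        System.out.println(idleSubtasks(4, 8));
    }
}
```

Whatever the exact rule, a partition is read by a single subtask, so with more source subtasks than partitions some of them necessarily receive no data.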

Re: How does flink read data from kafka number of TM's are more than topic partitions

2018-09-21 Thread Piotr Nowojski
+91 8407979163 > > > On Fri, Sep 21, 2018 at 6:26 PM Piotr Nowojski <mailto:pi...@data-artisans.com>> wrote: > Hi, > > Yes, in your case half of the Kafka source tasks wouldn’t read/process any > records (you can check that in web UI). This shouldn’t harm you, u

Re: Between Checkpoints in Kafka 11

2018-09-24 Thread Piotr Nowojski
Hi, I have nothing more to add. You (Dawid) and Vino explained it correctly :) Piotrek > On 24 Sep 2018, at 15:16, Dawid Wysakowicz wrote: > > Hi Harshvardhan, > > Flink won't buffer all the events between checkpoints. Flink uses Kafka's > transaction, which are committed only on

Re: How to migrate Kafka Producer ?

2019-01-02 Thread Piotr Nowojski
Hi Edward, Sorry for coming back so late (because of holiday season). You are unfortunately right. Our FlinkKafkaProducer should have been upgrade-able, but it is not. I have created a bug for this [1]. For the time being, until we fix the issue, you should be able to stick to 0.11 producer

Re: NPE when using spring bean in custom input format

2019-01-21 Thread Piotr Nowojski
Hi, You have to use `open()` method to handle initialisation of the things required by your code/operators. By the nature of the LocalEnvironment, the life cycle of the operators is different there compared to what happens when submitting a job to the real cluster. With remote environments

Re: Query on retract stream

2019-01-21 Thread Piotr Nowojski
Hi, There is a missing feature in Flink Table API/SQL of supporting retraction streams as the input (or conversions from append stream to retraction stream) at the moment. With that your problem would simplify to one simple `SELECT uid, count(*) FROM Changelog GROUP BY uid`. There is an

Re: Query on retract stream

2019-01-21 Thread Piotr Nowojski
. > And then you just need to analyze the events in this window. > > Piotr Nowojski mailto:pi...@da-platform.com>> > wrote on Mon, 21 Jan 2019 at 8:44 PM: > Hi, > > There is a missing feature in Flink Table API/SQL of supporting retraction > streams as the input (or conv

Re: runtime.resourcemanager

2018-12-11 Thread Piotr Nowojski
00). > 2018-12-10 12:20:22,409 INFO org.apache.flink.runtime.filecache.FileCache > - User file cache uses directory > /tmp/flink-dist-cache-058052c5-36cc-432f-88eb-8acf7dc5f1f1 > 2018-12-10 12:20:22,743 INFO > org.apache.flink.runtime.taskexecutor.TaskExe

Re: Failed to resume job from checkpoint

2018-12-07 Thread Piotr Nowojski
Adding back the user mailing list. Andrey, could you take a look at this? Piotrek > On 7 Dec 2018, at 12:28, Ben Yan wrote: > > Yes. Previous versions never happened > > Piotr Nowojski mailto:pi...@data-artisans.com>> > wrote on Fri, 7 Dec 2018 at 7:27 PM: > Hey, > > Do

Re: Use event time

2018-12-07 Thread Piotr Nowojski
this mean that the event time only impacts on the event selection for a > time window? > > Without use of a time window, the event time has no impact on the order of > any records/events? > > Is my understanding correct? > > Thank you very much for your help.

Re: Failed to resume job from checkpoint

2018-12-07 Thread Piotr Nowojski
Hey, Do you mean that the problem started occurring only after upgrading to Flink 1.7.0? Piotrek > On 7 Dec 2018, at 11:28, Ben Yan wrote: > > Hi. I am using flink-1.7.0. I am using RocksDB and HDFS as the state backend, but > recently I found the following exception when the job resumed from

Re: runtime.resourcemanager

2018-12-07 Thread Piotr Nowojski
Hi, Please investigate logs/standard output/error from the task manager that has failed (the logs that you showed are from the job manager). Probably there is some obvious error/exception explaining why it has failed. Most common reasons: - out of memory - long GC pause - seg fault or other error

Re: A question on the Flink "rolling" FoldFunction

2018-12-07 Thread Piotr Nowojski
Hi Min, Welcome to the Flink community. One small remark: the dev mailing list is for developers of Flink and all of the issues/discussions that arise in the process (discussing how to implement a new feature, etc.), so the user mailing list is the right one to ask questions about using Flink

Re: Use event time

2018-12-07 Thread Piotr Nowojski
Hi again! Flink doesn’t order/sort the records according to event time. The prevailing idea is: - records will be arriving out of order, and operators should handle that - watermarks are used as indicators of the current lower bound of the event time “clock” For example, windowed
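
The idea of a watermark as a lower bound on the event time "clock" can be sketched in plain Java. The rule below (watermark = highest timestamp seen so far minus a fixed allowed out-of-orderness) is one common strategy, assumed here for illustration; the class and its methods are hypothetical and not Flink API.

```java
// Illustration only: tracks a watermark over a stream of out-of-order timestamps.
class WatermarkSketch {
    private final long maxOutOfOrderness; // assumed bound on how late records may arrive
    private long maxSeen = Long.MIN_VALUE;

    public WatermarkSketch(long maxOutOfOrderness) {
        this.maxOutOfOrderness = maxOutOfOrderness;
    }

    // Records are processed in arrival order; the watermark never moves backwards.
    public long onEvent(long timestamp) {
        maxSeen = Math.max(maxSeen, timestamp);
        return currentWatermark();
    }

    public long currentWatermark() {
        return maxSeen == Long.MIN_VALUE ? Long.MIN_VALUE : maxSeen - maxOutOfOrderness;
    }

    // A record at or behind the watermark arrives after the "clock" has passed it.
    public boolean isLate(long timestamp) {
        return timestamp <= currentWatermark();
    }

    public static void main(String[] args) {
        WatermarkSketch w = new WatermarkSketch(5);
        System.out.println(w.onEvent(10)); // watermark after the first record
        System.out.println(w.onEvent(7));  // an out-of-order record leaves it unchanged
        System.out.println(w.onEvent(20)); // a newer record advances it
    }
}
```

Note that nothing is reordered: records flow through in arrival order, and only the watermark tells an operator how far event time has progressed.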

Re: Flink with Docker: docker-compose and FLINK_JOB_ARGUMENT exception

2018-12-07 Thread Piotr Nowojski
Hi, I have never used flink and docker together, so I’m not sure if I will be able to help, however have you seen this README: https://github.com/apache/flink/tree/master/flink-container/docker ? Shouldn’t you be passing your arguments via `FLINK_JOB_ARGUMENTS` environment variable? Piotrek

Re: Dulicated messages in kafka sink topic using flink cancel-with-savepoint operation

2018-12-03 Thread Piotr Nowojski
kafka consumer reads the messages from this sink, > the duplicated messages have not been read so everything is OK. > > > > Kind regards, > Nastaran Motavalli > > > From: Piotr Nowojski > Sent: Thursday, November 29, 2018 3:38:38 PM > To: Nastaran Motavali > Cc

Re: runtime.resourcemanager

2018-12-10 Thread Piotr Nowojski
ot > start at all. The whole time just the job manager is struggling... For very > very toy examples, after a long time (during this time I see the job manager > logs as I mentioned before), the job is started and can be executed in 2 > seconds. > > Best, > > Alieh > >

Re: Dulicated messages in kafka sink topic using flink cancel-with-savepoint operation

2018-11-29 Thread Piotr Nowojski
Hi Nastaran, When you are checking for duplicated messages, are you reading from Kafka using `read_committed` mode (this is not the default value)? https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/kafka.html#kafka-producer-partitioning-scheme > Semantic.EXACTLY_ONCE: uses
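
For the consumer used to verify the sink topic, the relevant Kafka consumer setting is `isolation.level`, whose default is `read_uncommitted`. A minimal sketch of building such a configuration with only the JDK follows; the class name, broker address, and group id are placeholders, not values from this thread.

```java
import java.util.Properties;

// Illustration only: consumer properties that hide records from aborted or
// still-open transactions. Class name and argument values are hypothetical.
class ReadCommittedConfigSketch {

    public static Properties consumerProps(String bootstrapServers, String groupId) {
        Properties props = new Properties();
        props.setProperty("bootstrap.servers", bootstrapServers);
        props.setProperty("group.id", groupId);
        // Kafka's default is "read_uncommitted", which also returns
        // records written by transactions that were later aborted.
        props.setProperty("isolation.level", "read_committed");
        return props;
    }

    public static void main(String[] args) {
        Properties props = consumerProps("localhost:9092", "duplicate-check");
        System.out.println(props.getProperty("isolation.level"));
    }
}
```

These properties would then be passed to whichever Kafka consumer reads the topic during the duplicate check.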

Re: [flink-cep] Flick CEP support for the group By operator

2018-11-23 Thread Piotr Nowojski
Hi, Yes, sure. Just use CEP on top of KeyedStream. Take a look at `keyBy`: https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/ Piotrek > On 23 Nov 2018, at 16:04, Spico Florin wrote: > > Hello! > > I'm using Flink 1.4.2 and I would like to use a group by

Re: Call batch job in streaming context?

2018-11-23 Thread Piotr Nowojski
Hi, I’m not sure if I understand your problem and your context, but spawning a batch job every 45 seconds doesn’t sound like that bad an idea (as long as the job is short). Another idea would be to incorporate this batch job inside your streaming job, for example by reading from Cassandra using

Re: error while joining two datastream

2018-11-23 Thread Piotr Nowojski
Hi, I assume that withTimestampsAndWatermarks1.print(); withTimestampsAndWatermarks2.print(); Actually prints what you have expected? If so, the problem might be that: a) time/watermarks are not progressing (watermarks are triggering the output of your

Re: [flink-cep] Flick CEP support for the group By operator

2018-11-25 Thread Piotr Nowojski
e/dev/stream/operators/windows.html Piotrek > On 23 Nov 2018, at 16:32, Piotr Nowojski wrote: > > Hi, > > Yes, sure. Just use CEP on top of KeyedStream. Take a look at `keyBy`: > > https://ci.apache.org/projects/flink/flink-docs-release-1.6/dev/stream/operators/ > >

Re: ***UNCHECKED*** Re: Standalone cluster instability

2018-09-19 Thread Piotr Nowojski
> task manager process when it discovers that it has been lost? It reports that > there are no active task managers and available slots are 0. We're running on > flink version 1.4.2. > > I've attached the syslog and jobmanager log, the crash happened at Sep 18 > 23:31:14. >

Re: Sampling rate higher than 1Khz

2019-01-28 Thread Piotr Nowojski
Hi, Maybe a stupid idea, but does anything prevent a user from pretending that watermarks/event times are in a different unit, for example microseconds? Of course, this assumes using row/event time and not using processing time for anything. Piotrek > On 28 Jan 2019, at 14:58, Tzu-Li (Gordon) Tai
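
The suggestion relies on window arithmetic being unit-agnostic: assigning a timestamp to a tumbling window only needs integer arithmetic on plain long values. A minimal plain-Java sketch follows, with hypothetical names and an assumed 2 kHz signal (one sample every 500 microseconds).

```java
// Illustration only: tumbling-window assignment on timestamps whose unit is
// whatever the user decides, here microseconds instead of milliseconds.
class MicrosecondEventTimeSketch {

    // Start of the tumbling window containing the timestamp; both arguments
    // must be expressed in the same unit.
    public static long windowStart(long timestamp, long windowSize) {
        return timestamp - Math.floorMod(timestamp, windowSize);
    }

    public static void main(String[] args) {
        long[] sampleMicros = {0L, 500L, 1000L, 1500L}; // assumed 2 kHz samples
        for (long t : sampleMicros) {
            // A window of 1000 units is 1 ms when the unit is microseconds.
            System.out.println(t + " us -> window starting at " + windowStart(t, 1000L) + " us");
        }
        // Truncated to whole milliseconds, the samples at 1000 us and 1500 us
        // would share one timestamp, which is why millisecond event time is too coarse.
        System.out.println((1000L / 1000L) == (1500L / 1000L));
    }
}
```

The caveat from the message still applies: anything that ties timestamps to wall-clock time, such as processing time or durations expressed in seconds, would presumably be off by a factor of 1000 under this reinterpretation.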

Re: Setting source vs sink vs window parallelism with data increase

2019-03-27 Thread Piotr Nowojski
r help again. > > On Wed, Mar 6, 2019 at 9:44 PM Padarn Wilson <mailto:pad...@gmail.com>> wrote: > Thanks a lot for your suggestion. I’ll dig into it and update for the mailing > list if I find anything useful. > > Padarn > > On Wed, 6 Mar 2019 at 6:03 PM, Piotr

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-27 Thread Piotr Nowojski
ation that could be helpful. > > Would appreciate your help. > > Thanks, > Rafi > > > On Thu, Mar 21, 2019 at 5:20 PM Kostas Kloudas <mailto:kklou...@gmail.com>> wrote: > Hi Rafi, > > Piotr is correct. In-progress files are not necessarily readable.

Re: StochasticOutlierSelection

2019-03-21 Thread Piotr Nowojski
va, I chose the MOA library in combination with flink API for > anomaly detection streaming which gives quite satisfactory results. > > Best, > > Anissa > > > > On Mon, 4 Mar 2019 at 16:08, Piotr Nowojski <mailto:pi...@ververica.com>> wrote: > >

Re: End-to-end exactly-once semantics in simple streaming app

2019-04-08 Thread Piotr Nowojski
Hi, Regarding the JDBC and Two-Phase commit (2PC) protocol. As Fabian mentioned it is not supported by the JDBC standard out of the box. With some workarounds I guess you could make it work by for example following one of the ideas: 1. Write records using JDBC with at-least-once semantics, by

Re: Flink Custom SourceFunction and SinkFunction

2019-03-04 Thread Piotr Nowojski
Hi, I couldn’t find any references for your question, nor have I seen such a use case, but: Re 1. It looks like it could work Re 2. It should work as well, but just try to use StreamingFileSink Re 3. For custom source/sink function, if you do not care about data processing guarantees it’s

Re: Setting source vs sink vs window parallelism with data increase

2019-03-04 Thread Piotr Nowojski
Hi, What Flink version are you using? Generally speaking, Flink might not be the best if you have record fan-out; this may significantly increase checkpointing time. However, you might want to first identify what’s causing long GC times. If there are long GC pauses, this should be the first thing

Re: Task slot sharing: force reallocation

2019-03-04 Thread Piotr Nowojski
Hi, Are you asking the question if that’s the behaviour or you have actually observed this issue? I’m not entirely sure, but I would guess that the Sink tasks would be distributed randomly across the cluster, but maybe I’m mixing this issue with resource allocations for Task Managers. Maybe

Re: Command exited with status 1 in running Flink on marathon

2019-03-04 Thread Piotr Nowojski
Hi, With just this information it might be difficult to help. Please look for some additional logs (has the Flink managed to log anything?) or some standard output/errors. I would guess this might be some relatively simple mistake in configuration, like file/directory read/write/execute

Re: [Flink-Question] In Flink parallel computing, how do different windows receive the data of their own partition, that is, how does Window determine which partition number the current Window belongs

2019-03-04 Thread Piotr Nowojski
Hi, I’m not sure if I understand your question/concerns. As Rong Rong explained, the key selector is used to assign records to window operators. Within a key context, you do not have access to other keys/values in your operator/functions, so your reduce/process/… functions when processing key:1 won’t

Re: StochasticOutlierSelection

2019-03-04 Thread Piotr Nowojski
Hi, I have never used this code, but ml library depends heavily on Scala, so I wouldn’t recommend using it with Java. However if you want to go this way (I’m not sure if that’s possible), you would have to pass the implicit parameters manually somehow (I don’t know how to do that from

Re: Flink parallel subtask affinity of taskmanager

2019-03-04 Thread Piotr Nowojski
Hi, You should be able to use legacy mode for this: https://ci.apache.org/projects/flink/flink-docs-stable/ops/config.html#legacy However note that this option will disappear in the near future and there is a JIRA ticket to address this issue: https://issues.apache.org/jira/browse/FLINK-11815

Re: event time timezone is not correct

2019-03-04 Thread Piotr Nowojski
Hi, I think that Flink SQL works currently only in UTC, so the 8 hours difference is a result of you using GMT+8 time stamps somewhere. Please take a look at this thread: http://mail-archives.apache.org/mod_mbox/flink-user/201711.mbox/%3c2e1eb190-26a0-b288-39a4-683b463f4...@apache.org%3E I

Re: Setting source vs sink vs window parallelism with data increase

2019-03-06 Thread Piotr Nowojski
s > a problem, or > 2) Is is the shuffle of all the records after the expansion which is taking a > large time - if so, is there anything I can do to mitigate this other than > trying to ensure less shuffle. > > Thanks, > Padarn > > > On Tue, Mar 5, 2019 at 7:0

Re: event time timezone is not correct

2019-03-05 Thread Piotr Nowojski
n > >> On 4 Mar 2019, at 11:29 PM, Piotr Nowojski > <mailto:pi...@ververica.com>> wrote: >> >> Hi, >> >> I think that Flink SQL works currently only in UTC, so the 8 hours >> difference is a result of you using GMT+8 time stamps somewhere. Please take >>

Re: Command exited with status 1 in running Flink on marathon

2019-03-05 Thread Piotr Nowojski
.org/projects/flink/flink-docs-stable/ops/deployment/mesos.html> > > But I did not install Hadoop. Is that a problem? Since HDFS was commented out, > I did not change it. > > On Mon, Mar 4, 2019 at 4:40 PM Piotr Nowojski <mailto:pi...@ververica.com>> wrote: > Hi,

Re: Task slot sharing: force reallocation

2019-03-05 Thread Piotr Nowojski
g map 1 of job 2 on machine 2, map 1 of job 3 on machine 3 so we end up > with sinks sit evenly throughout the cluster). > > Thanks. > > Le > > On Mon, Mar 4, 2019 at 6:49 AM Piotr Nowojski <mailto:pi...@ververica.com>> wrote: > Hi, > > Are you asking t

Re: How to join stream and dimension data in Flink?

2019-03-14 Thread Piotr Nowojski
unt * r.rate > FROM > Orders AS o > LEFT JOIN LatestRates FOR SYSTEM_TIME AS OF o.proctime AS r > ON r.currency = o.currency > > CC @Piotr Nowojski <mailto:pi...@data-artisans.com> Would be great to have > your opinions here. > > Best, > Hequn > > [1] &g

Re: [VOTE] Release 1.8.0, release candidate #3

2019-03-20 Thread Piotr Nowojski
ttp://codespeed.dak8s.net:8000/timeline/?ben=tumblingWindow=2> Piotr Nowojski > On 20 Mar 2019, at 10:09, Aljoscha Krettek wrote: > > Thanks Jincheng! It would be very good to fix those but as you said, I would > say they are not blockers. > >> On 20. Mar 2019, at 09:47, Kurt Young

Re: Streaming Protobuf into Parquet file not working with StreamingFileSink

2019-03-21 Thread Piotr Nowojski
Hi, I’m not sure, but shouldn’t you just be reading committed files and ignoring in-progress ones? Maybe Kostas could add more insight to this topic. Piotr Nowojski > On 20 Mar 2019, at 12:23, Rafi Aroch wrote: > > Hi, > > I'm trying to stream events in Protobuf format into a pa

Re: Best practice to handle update messages in stream

2019-03-21 Thread Piotr Nowojski
anyway (like join). Piotr Nowojski [1] https://issues.apache.org/jira/browse/FLINK-8545 <https://issues.apache.org/jira/browse/FLINK-8545> > On 21 Mar 2019, at 09:39, 徐涛 wrote: > > Hi Experts, > Assuming there is a stream which content is like this: >Seq

Re: Default Kafka producers pool size for FlinkKafkaProducer.Semantic.EXACTLY_ONCE

2019-04-11 Thread Piotr Nowojski
Hi Min and Fabian, The pool size is independent of the parallelism, task slots count or task managers count. The only thing that you should consider is how many simultaneous checkpoints you might have in your setup. As Fabian wrote, with >

Re: How to trigger the window function even there's no message input in this window?

2019-06-17 Thread Piotr Nowojski
Hi, As far as I know, this is currently impossible. You can workaround this issue by maybe implementing your own custom post processing operator/flatMap function, that would: - track the output of window operator - register processing time timer with some desired timeout - every time the
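
The workaround outlined here can be sketched without Flink by driving the logic with an explicit clock. Since the snippet is cut off, the exact rule is an assumption: the sketch emits a placeholder whenever no window output has been seen for a full timeout period. The class and method names are hypothetical; in a real job the timer callback would presumably come from Flink's processing time timers.

```java
// Illustration only: post-processing logic that notices "silent" windows.
class IdleWindowTimeoutSketch {
    private final long timeoutMs;
    private long lastOutputTime;

    public IdleWindowTimeoutSketch(long timeoutMs, long now) {
        this.timeoutMs = timeoutMs;
        this.lastOutputTime = now;
    }

    // Track the output of the window operator.
    public void onWindowOutput(long now) {
        lastOutputTime = now;
    }

    // Called when a processing time timer fires. Returns true when a
    // placeholder result should be emitted because the window stayed silent.
    public boolean onTimer(long now) {
        if (now - lastOutputTime >= timeoutMs) {
            lastOutputTime = now; // restart the timeout period
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        IdleWindowTimeoutSketch op = new IdleWindowTimeoutSketch(1_000L, 0L);
        op.onWindowOutput(500L);
        System.out.println(op.onTimer(1_000L)); // output seen 500 ms ago
        System.out.println(op.onTimer(1_500L)); // a full timeout without output
    }
}
```

Passing the current time in as an argument keeps the sketch deterministic; it is a testing convenience rather than a statement about how Flink delivers timer callbacks.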

Re: Best practice to process DB stored log (is Flink the right choice?)

2019-06-17 Thread Piotr Nowojski
Hi, Those are good questions. > A datastream to connect to a table is available? I What table, what database system do you mean? You can check the list of existing connectors provided by Flink in the documentation. About reading from relational DB (example by using JDBC) you can read a little

Re: Error while using session window

2019-06-17 Thread Piotr Nowojski
Hi, Thanks for reporting the issue. I think this might be caused by System.currentTimeMillis() not being monotonic [1] and the fact Flink is accessing this function per element multiple times (at least twice: first for creating a window, second to perform the check that has failed in your

Re: Timeout about local test case

2019-06-17 Thread Piotr Nowojski
Hi, I don’t know what’s the reason (also there are no open issues with this test in our jira). This test seems to be working on travis/CI and it works for me when I’ve just tried running it locally. There might be some bug in the test/production that is triggered only in some specific

Re: Why the current time of my window never reaches the end time when I use KeyedProcessFunction?

2019-06-18 Thread Piotr Nowojski
Hi, Isn’t your problem that the source is constantly emitting the data and bumping your timers? Keep in mind that the code that you are basing on has the following characteristic: > In the following example a KeyedProcessFunction maintains counts per key, and > emits a key/count pair whenever

Re: CoProcessFunction vs Temporal Table to enrich multiple streams with a slow stream

2019-05-14 Thread Piotr Nowojski
Hi, Sorry for late response, somehow I wasn’t notified about your e-mail. > > So you meant implementation in DataStreamAPI with cutting corners would, > generally, shorter than Table Join. I thought that using Tables would be > more intuitive and shorter, hence my initial question :) It

Re: State migration into multiple operators

2019-05-14 Thread Piotr Nowojski
Hi, Currently there is no native Flink support for modifying the state in such a manner. However, there is an on-going effort [1] and a third party project [2] to address exactly this. Both allow you to read a savepoint, modify it and write back the new modified savepoint from which you can

Re: Possilby very slow keyBy in program with no parallelism

2019-05-21 Thread Piotr Nowojski
Hi Theo, Regarding the performance issue. > None of my machine resources is fully utilized, i.e. none of the cluster CPU > runs at 100% utilization (according to htop). And the memory is virtually > available, but the RES column in htop states the processes uses 5499MB. By nature of stream

Re: update the existing Keyed value state

2019-05-03 Thread Piotr Nowojski
m/king/bravo <https://github.com/king/bravo> Piotr Nowojski > On 3 May 2019, at 11:14, Selvaraj chennappan > wrote: > > Hi Users, > We want to have a real time aggregation (KPI) . > we are maintaining aggregation counters in the keyed value state . > key could be

Re: RocksDB native checkpoint time

2019-05-03 Thread Piotr Nowojski
Hi Gyula, Have you read our tuning guide? https://ci.apache.org/projects/flink/flink-docs-stable/ops/state/large_state_tuning.html#tuning-rocksdb Synchronous part is mostly about flushing

Re: DateTimeBucketAssigner using Element Timestamp

2019-05-03 Thread Piotr Nowojski
Hi Peter, It sounds like this should work, however my question would be do you want exactly-once processing? If yes, then you would have to somehow know which exact events needs re-processing or deduplicate them somehow. Keep in mind that in case of an outage in the original job, you probably

Re: CoProcessFunction vs Temporal Table to enrich multiple streams with a slow stream

2019-05-03 Thread Piotr Nowojski
Hi Averell, I will be referring to your original two options: 1 (duplicating stream_C) and 2 (multiplexing stream_A and stream_B). Both of them could be expressed using Temporal Table Join. You could multiplex stream_A and stream_B in Table API, temporal table join them with stream_C and then

Re: Flink Kafka ordered offset commit & unordered processing

2019-07-03 Thread Piotr Nowojski
Hi, > Will Flink able to recover under this scenario? I’m not sure exactly what you mean. Flink will be able to restore the state to the last successful checkpoint, and it could well be that some records after this initial “stuck record” were processed and emitted down the stream. In

Re: Flink Kafka ordered offset commit & unordered processing

2019-07-02 Thread Piotr Nowojski
Hi, If your async operations are stalled, this will eventually cause problems. Either this will back pressure sources (the async’s operator queue will become full) or you will run out of memory (if you configured the queue’s capacity too high). I think the only possible solution is to either

Re: Kafka ProducerFencedException after checkpointing

2019-08-12 Thread Piotr Nowojski
ast >> external checkpoint is committed. >> Now that I have 15min for Producer's transaction timeout and 10min for >> Flink's checkpoint interval, and every checkpoint takes less than 5 minutes, >> everything is working fine. >> Am I right? >> >> Anyway

Re: Kafka ProducerFencedException after checkpointing

2019-08-12 Thread Piotr Nowojski
ccurred. > > Best, > Tony Wei > > Piotr Nowojski mailto:pi...@data-artisans.com>> wrote on > Mon, 12 Aug 2019 at 3:27 PM: > Hi, > > Yes, if it’s due to transaction timeout you will lose the data. > > Whether you can fall back to at-least-once, that depends on Kafka,

Re: Checkpoints very slow with high backpressure

2019-07-31 Thread Piotr Nowojski
Hi, For Flink 1.8 (and 1.9) the only thing that you can do is to try to limit the amount of data buffered between the nodes (check Flink network configuration [1] for number of buffers and/or buffer pool sizes). This can reduce maximal throughput (but only if the network transfer is a significant

Re: [ANNOUNCE] Hequn becomes a Flink committer

2019-08-07 Thread Piotr Nowojski
Congratulations :) > On 7 Aug 2019, at 12:09, JingsongLee wrote: > > Congrats Hequn! > > Best, > Jingsong Lee > > -- > From:Biao Liu > Send Time: Wed, 7 Aug 2019 12:05 > To:Zhu Zhu > Cc:Zili Chen ; Jeff Zhang ; Paul Lam > ;

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
will carry its schema info among operators, it will cost about 2x for > serialization and deserialization between operators. > > Is there a better workaround that all the operators could notice the schema > change and at the same time not breaking the operator chaining? > > Thanks!

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
Hi, Broadcasting will break an operator chain. However, my best guess is that the Kafka source will still be a performance bottleneck in your job. Also, network exchanges add some measurable overhead only if your records are very lightweight and easy to process (for example if you are using RocksDB

Re: Will broadcast stream affect performance because of the absence of operator chaining?

2019-08-06 Thread Piotr Nowojski
of the operator(for > example, operatorA), and other subtasks of the operator are not aware of it. > In this case is there anything I have missed? > > Thank you! > > > > > > > ------ Original -- > From: "Piotr Nowojski

Re: [SURVEY] What is the most subtle/hard to catch bug that people have seen?

2019-10-01 Thread Piotr Nowojski
Hi, Are you asking about bugs in Flink, in libraries that Flink is using or bugs in applications that were using Flink? From my perspective/what I have seen: The most problematic bugs while developing features for Flink: Dead locks & data losses caused by concurrency issues in network

Re: is Flink a database ?

2019-11-04 Thread Piotr Nowojski
Hi :) What do you mean by “a database”? A SQL like query engine? Flink is already that [1]. A place where you store the data? Flink kind of is that as well [2] and many users are using Flink as the source of truth, not just as a data processing framework. With Flink Table API/SQL [1], you can

Re: PreAggregate operator with timeout trigger

2019-10-30 Thread Piotr Nowojski
Hi, If you want to register a processing/event time trigger in your custom operator, you can take a look how other operators are doing it, by calling AbstractStreamOperator#getInternalTimerService [1]. You can look around the Flink’s code base for usages of this method, there are at least

Re: Flink 1.5+ performance in a Java standalone environment

2019-10-30 Thread Piotr Nowojski
Hi, In Flink 1.5 there were three big changes, that could affect performance. 1. FLIP-6 changes (As previously Yang and Fabian mentioned) 2. Credit base flow control (especially if you are using SSL) 3. Low latency network changes I would suspect them in that order. First and second you can

Re: low performance in running queries

2019-10-30 Thread Piotr Nowojski
Hi, I would also suggest to just attach a code profiler to the process during those 2 hours and gather some results. It might answer some questions what is taking so long time. Piotrek > On 30 Oct 2019, at 15:11, Chris Miller wrote: > > I haven't run any benchmarks with Flink or even used

Re: No FileSystem for scheme "file" for S3A in and state processor api in 1.9

2019-10-30 Thread Piotr Nowojski
Hi, Thanks for reporting the issue, I’ve created the jira ticket for that [1]. We will investigate it and try to address it somehow. Could you try out if the same issue happen when you use flink-s3-fs-presto [2]? Piotrek [1] https://issues.apache.org/jira/browse/FLINK-14574 [2]

Re: How to stream intermediate data that is stored in external storage?

2019-10-30 Thread Piotr Nowojski
Hi, I’m not sure what you are trying to achieve. What do you mean by “state of full outer join”? The result of it? Or its internal state? Also keep in mind that the internal state of the operators in Flink is already snapshotted/written down to external storage during checkpointing.

Re: possible backwards compatibility issue between 1.8->1.9?

2019-10-31 Thread Piotr Nowojski
e in a KeyedProcessFunction, > starting without state was not an option for us. We just tried to restart it > with "operator chaining disabled", and then surprisingly it worked. > > How can we explain this different behaviour of the second job in test and > prod? The only

Re: How to stream intermediate data that is stored in external storage?

2019-10-30 Thread Piotr Nowojski
ming fashion > in an update made? > > Thanks! > > On Wed, Oct 30, 2019 at 7:25 AM Piotr Nowojski <pi...@ververica.com> wrote: > Hi, > > I’m not sure what are you trying to achieve. What do you mean by “state of > full outer join”? The result of

Re: No FileSystem for scheme "file" for S3A in and state processor api in 1.9

2019-11-01 Thread Piotr Nowojski
OK, thanks for the explanation, now it makes sense. Previously I hadn’t noticed that those snapshot state calls visible in your stack trace come from the State Processor API. We will try to reproduce it, so we might have more questions later, but this information might be enough. One more

Re: low performance in running queries

2019-11-01 Thread Piotr Nowojski
Hi, > Is there a simple way to get profiling information in Flink? Flink doesn’t provide any special tooling for that. Just use your chosen profiler, for example: Oracle’s Mission Control (free on non-production clusters, no need to install anything if already using Oracle’s JVM), VisualVM

Re: No FileSystem for scheme "file" for S3A in and state processor api in 1.9

2019-10-30 Thread Piotr Nowojski
But from the stack trace that you have posted it looks like you are using Hadoop’s S3 implementation for the checkpointing? If so, can you try using Presto and check whether you still encounter the same issue? Also, could you explain how to reproduce the issue? What configuration are you

Re: Document an example pattern that makes sources and sinks pluggable in the production code for testing

2019-11-14 Thread Piotr Nowojski
Hi, Doesn’t the included example `ExampleIntegrationTest` demonstrate the idea of > inject special test sources and test sinks in your tests? Piotrek > On 11 Nov 2019, at 13:44, Hung wrote: > > Hi guys, > > I found the testing part mentioned > > make sources and sinks pluggable in

Re: How to unsubscribe the Apache projects and jira issues notification

2019-11-15 Thread Piotr Nowojski
Hi, Please check the first link on Google for “unsubscribe user@flink.apache.org”. Piotrek > On 15 Nov 2019, at 11:40, P. Ramanjaneya Reddy wrote: > > Hi > > Following blogs want to unsubscribe kindly guide. > > I tried from google..still mails receiving > > Also should unubscribe.. > >

Re: low performance in running queries

2019-11-04 Thread Piotr Nowojski
> On 11/1/2019 4:40 PM, Piotr Nowojski wrote: >> Hi, >> >> More important would be the code profiling output. I think VisualVM allows >> to share the code profiling result as “snapshots”? If you could analyse or >> share this, it would be helpful.

Re: PreAggregate operator with timeout trigger

2019-11-05 Thread Piotr Nowojski
nks! > > -- > -- Felipe Gutierrez > -- skype: felipe.o.gutierrez > -- https://felipeogutierrez.blogspot.com > > On Wed, Oct 30, 2019 at 2:59 PM Piotr Nowojski <pi...@ververica.com> wrote: > Hi, > > If you want to

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-11-21 Thread Piotr Nowojski
Hi, I would suspect this to be the source of the problems: https://issues.apache.org/jira/browse/FLINK-12070 There seems to be a hidden configuration option that avoids using memory mapped files:

Re: CoGroup SortMerger performance degradation from 1.6.4 - 1.9.1?

2019-12-03 Thread Piotr Nowojski
Hi, Yes, it is only related to **batch** jobs, but not necessarily only to DataSet API jobs. If you are using for example Blink SQL/Table API to process some bounded data streams (tables), it could also be visible/affected there. If not, I would suggest starting a new user mailing list

Re: [ANNOUNCE] Zhu Zhu becomes a Flink committer

2019-12-13 Thread Piotr Nowojski
Congratulations! :) > On 13 Dec 2019, at 18:05, Fabian Hueske wrote: > > Congrats Zhu Zhu and welcome on board! > > Best, Fabian > > Am Fr., 13. Dez. 2019 um 17:51 Uhr schrieb Till Rohrmann < > trohrm...@apache.org>: > >> Hi everyone, >> >> I'm very happy to announce that Zhu Zhu accepted

Re: How does Flink handle backpressure in EMR

2019-12-05 Thread Piotr Nowojski
Hi Michael, As Roman pointed out, Flink currently doesn’t support auto-scaling. It’s on our roadmap, but it requires quite a bit of preliminary work to happen first. Piotrek > On 5 Dec 2019, at 15:32, r_khachatryan wrote: > > Hi Michael > > Flink *does* detect backpressure but currently,
