Blob Server Removes Failed Jobs Immediately

2018-06-20 Thread Dominik Wosiński
Hello, I'm not sure whether the problem is connected with bad configuration or it's some inconsistency in the documentation but according to this document: https://cwiki.apache.org/confluence/display/FLINK/FLIP-19%3A+Improved+BLOB+storage+architecture . *I*f a job fails, all non-HA files'

Re: How to set log level using Flink Docker Image

2018-06-21 Thread Dominik Wosiński
You can for example mount the *conf* directory using docker volumes. You would need to have *logback.xml* and then mount it as *conf/logback.xml *inside the image. Locally You could do this using *docker-compose.yml*, for mounting volumes in kubernetes refer to this page:

ODP: docker, error NoResourceAvailableException..

2018-08-15 Thread Dominik Wosiński
Hey, The problem is that your command does start Job Manager container, but it does not start the Task Manager . That is why you have 0 slots. Currently, the default numberOfTaskSlots is set to the number of CPUs avaialbe on the machine. So, You generally can to do 2 things: 1) Start Job

Re: Initial value of ValueStates of primitive types

2018-08-17 Thread Dominik Wosiński
Hey, After you call, by default values you mean after you call : getRuntimeContext.getState() If so, the default value will be state with *value() *of null, as described in : /** * Returns the current value for the state. When the state is not * partitioned the returned value is the same for

Re: Flink not rolling log files

2018-08-17 Thread Dominik Wosiński
I am using this *log4j.properties *file config for rolling files once per day and it is working perfectly. Maybe this will give You some hint: log4j.appender.file=org.apache.log4j.DailyRollingFileAppender log4j.appender.file.file=${log.file} log4j.appender.file.append=false

Re: Flink Jobmanager Failover in HA mode

2018-08-17 Thread Dominik Wosiński
I have faced this issue, but in 1.4.0 IIRC. This seems to be related to https://issues.apache.org/jira/browse/FLINK-10011. What was the status of the jobs when the main Job Manager has been stopped ? 2018-08-17 17:08 GMT+02:00 Helmut Zechmann : > Hi all, > > we have a problem with flink 1.5.2

Re: Flink checkpointing to Google Cloud Storage

2018-08-21 Thread Dominik Wosiński
Hey, >From my perspective, such issues always meant clashing dependencies in case of Flink. Have you checked the full dependency tree if there are no issues there ? Best Regards, Dominik.

Fwd: would you join a Slack workspace for Flink?

2018-08-26 Thread Dominik Wosiński
-- Forwarded message - From: Dominik Wosiński Date: niedz., 26 sie 2018 o 15:12 Subject: ODP: would you join a Slack workspace for Flink? To: Hequn Cheng Hey, I have been facing this issue for multiple open source projects and discussions. Slack in my opinion has two main

Dealing with Not Serializable classes in Java

2018-08-26 Thread Dominik Wosiński
Hey, I was wondering how do You normally deal with fields that contain references that are not serializable. Say, we have a custom serialization schema in Java that needs to serialize *LocalDateTime* field with *ObjectMapper.* This requires registering specific module for *ObjectMapper* and this

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
new > ObjectMapper().registerModule(new JavaTimeModule()); > > On 27.08.2018 10:28, Dominik Wosiński wrote: > > Hey Paul, > Yeah that is possible, but I was asking in terms of serialization schema. > So I would really want to avoid RichFunction :) > > Best Regards, >

Re: Dealing with Not Serializable classes in Java

2018-08-27 Thread Dominik Wosiński
ing it. > If the ObjectMapper is thread-safe you could also initialize it as a > static field. > > On 26.08.2018 17:58, Dominik Wosiński wrote: > > Hey, > > I was wondering how do You normally deal with fields that contain > references that are not serializable. Say, we have a custom

Re: Cluster die when one of the TM killed

2018-08-20 Thread Dominik Wosiński
Hey, Can You please provide a little more information about your setup and maybe logs showing when the crash occurs? Best Regards, Dominik 2018-08-20 16:23 GMT+02:00 Siew Wai Yow : > Hi, > > > When one of the task manager is killed, the whole cluster die, is this > something expected? We are

ODP: API for delayed/scheduled interval input source for integrationtests

2018-09-01 Thread Dominik Wosiński
Hey, Maybe it would be a good idea to create somekind of test source for DataStream that allows writing elements to stream directly. Similarly like it’s done for reactive libraries sources. This would make creating tests a lot easier for Flink. Best Regards, Dom. Wysłane z aplikacji Poczta

ODP: How to add jvm Options when using Flink dashboard?

2018-09-05 Thread Dominik Wosiński
Hey, You can’t as Chesnay have already said, but for your usecase you could use arguments instead of JVM option and they will work equally good. Best Regards, Dom. Wysłane z aplikacji Poczta dla Windows 10 Od: Chesnay Schepler Wysłano: środa, 5 września 2018 11:43 Do: zpp;

Re: Duplicates in self join

2018-10-08 Thread Dominik Wosiński
Hey, IMHO, the simplest way in your case would be to use the Evictor to evict duplicate values after the window is generated. Have look at it here: https://ci.apache.org/projects/flink/flink-docs-release-1.6/api/java/org/apache/flink/streaming/api/windowing/evictors/Evictor.html Best Regards,

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
Hey, What is the exact issue that you are facing and the Flink version that you are using ?? Best Regards, Dom. pt., 12 paź 2018 o 16:11 Krishna Kalyan napisał(a): > Hello All, > > I need some help making async API calls. I have tried the following code > below. > > class

Re: Making calls to external API wit Data Streams

2018-10-12 Thread Dominik Wosiński
syncWeatherAPIRequest > > > > On Fri, 12 Oct 2018 at 16:21, Dominik Wosiński wrote: > >> Hey, >> What is the exact issue that you are facing and the Flink version that >> you are using ?? >> >> >> Best Regards, >> Dom. >> >> pt., 12 paź 201

Re: Mapstatedescriptor

2018-10-13 Thread Dominik Wosiński
Hey, It's the name for the whole descriptor. Not the keys, it means that no other descriptor should be created with the same name. Best Regards, Dom. Sob., 13.10.2018, 09:50 użytkownik Szymon napisał: > > > Hi, i have a question about MapStateDescriptor used to create MapState. > I have a

Re: Kafka connector error: This server does not host this topic-partition

2018-10-23 Thread Dominik Wosiński
Hey Alexander, It seems that this issue occurs when the broker is down and the partition is selecting the new leader AFAIK. There is one JIRA issue I have found, not sure if that's what are You looking for: https://issues.apache.org/jira/browse/KAFKA-6221 This issue is connected with Kafka itself

Re: KafkaException or ExecutionStateChange failure on job startup

2018-10-23 Thread Dominik Wosiński
Hey Mark, Do You use more than 1 Kafka consumer for Your jobs? I think this relates to the known issue in Kafka: https://issues.apache.org/jira/browse/KAFKA-3992. The problem is that if You don't provide client ID for your *KafkaConsumer* Kafka assigns one, but this is done in an unsynchronized

Re: Manual trigger the window in fold operator or incremental aggregation

2018-10-24 Thread Dominik Wosiński
Hey Zhen Li, What are You trying to do exactly? Maybe there is a more suitable method than manually triggering windows available in Flink. Best Regards, Dom. śr., 24 paź 2018 o 09:25 Dawid Wysakowicz napisał(a): > Hi Zhen Li, > > As far as I know that is not possible. For such custom handling

Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey, Could You please show a sample data that You want to process? This would help in verifying the issue. Best Regards, Dom. wt., 13 lis 2018 o 13:58 Jeff Zhang napisał(a): > Hi, > > I hit the following error when I try to use kafka connector in flink table > api. There's very little

Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
Hey, DId You try to run any other job on your setup? Also, could You please tell what are the sources you are trying to use, do all messages come from Kafka?? >From the first look, it seems that the JobManager can't connect to one of the TaskManagers. Best Regards, Dom. pon., 12 lis 2018 o

Re: Field could not be resolved by the field mapping when using kafka connector

2018-11-15 Thread Dominik Wosiński
Hey, Thanks for the info, I haven't noticed that. I was just going through older messages with no responses. Best Regards, Dom.

Re: Flink with parallelism 3 is running locally but not on cluster

2018-11-15 Thread Dominik Wosiński
PS. Could You also post the whole log for the application run ?? Best Regards, Dom. czw., 15 lis 2018 o 11:04 Dominik Wosiński napisał(a): > Hey, > > DId You try to run any other job on your setup? Also, could You please > tell what are the sources you are trying to use, do all m

Re: JSON to CEP coversion

2019-01-22 Thread Dominik Wosiński
Hey Anish, I have done some abstraction over the logic of CEP, but with the use of Apache Bahir[1], which introduces SIddhi CEP[2][ engine that allows SQL like definitions of the logic. Best, Dom. [1] https://github.com/apache/bahir [2] https://github.com/wso2/siddhi wt., 22 sty 2019 o 20:20

Re: Getting RemoteTransportException

2019-01-17 Thread Dominik Wosiński
*Hey,* As for the question about taskmanager.network.netty.server.numThreads . It is the size of the thread pool that will be used by the netty server. The default value is -1,

Re: Kafka consumer, is there a way to filter out messages using key only?

2018-12-27 Thread Dominik Wosiński
Hey, AFAIK, returning null from deserialize function in FlinkKafkaConsumer will indeed filter the message out and it won't be further processed. Best Regards, Dom. śr., 19 gru 2018 o 11:06 Dawid Wysakowicz napisał(a): > Hi, > > I'm afraid that there is no out-of-the box solution for this, but

Re: Changes in Flink 1.6.2

2018-11-30 Thread Dominik Wosiński
h.JobGraph > and method getJobGraph in class StreamingPlan of type > ()org.apache.flink.runtime.jobgraph.JobGraph > match expected type ? > System.out.println("[info] Job ID: " + > env.getStreamGraph.getJobGraph.getJobID) > > Boris Lublinsky > FDP Architect > bo

Re: Flink SQL

2018-11-30 Thread Dominik Wosiński
Hey, Not exactly sure by what you mean by "nothing" but generally the concept is. The data is fed to the dynamic table and the result of the query creates another dynamic table. So, if the resulting query returns an empty table, no data will indeed be written to the S3. Not sure if this was what

Re: Store Predicate or any lambda in MapState

2018-11-21 Thread Dominik Wosiński
Hey Jayant, I don't really think that the sole fact of using Predicate should cause the *ClassNotFoundException* that You are talking about. The exception may come from the fact that some libraries are missing from Your cluster environment. Have You tried running the job locally to verify that

Re: Can JDBCSinkFunction support exectly once?

2018-11-21 Thread Dominik Wosiński
Hey, As far as I know, the function needs to implement the *TwoPhaseCommitFunction* and not the *CheckpointListener. JDBCSinkFunction *does not implement the two-phase commit, so currently it does not support exactly once. Best Regards, Dom. śr., 21 lis 2018 o 11:07 Jocean shi napisał(a): >

Re: Kinesis Connector - NoClassDefFoundError

2018-11-20 Thread Dominik Wosiński
Hey, Have you updated the versions both on the environment and the dependency on the job? >From my personal experience, 95 % of such issues is due to the mismatch between Flink versions on the cluster you are using and Your job. Best Regards, Dom. wt., 20 lis 2018 o 07:41 Steve Bistline

Re: Changes in Flink 1.6.2

2018-11-28 Thread Dominik Wosiński
Hey, Could you show the message that You are getting? Best Regards, Dom. śr., 28 lis 2018 o 19:08 Boris Lublinsky napisał(a): > > > > Prior to Flink version 1.6.2 including 1.6.1 > env.getStreamGraph.getJobGraph was happily returning currently defined > Graph, but in 1.6.2 this fails to compile

Re: Passing vm options

2019-01-07 Thread Dominik Wosiński
Hey, AFAIK, Flink supports dynamic properties currently only on YARN and not really in standalone mode. If You are using YARN it should indeed be possible to set such configuration. If not, then I am afraid it is not possible. Best Regards, Dom. pon., 7 sty 2019 o 09:01 Avi Levi napisał(a): >

Re: kafka corrupt record exception

2019-04-02 Thread Dominik Wosiński
Hey, As far as I understand the error is not caused by the deserialization but really by the polling of the message, so custom deserialization schema won't really help in this case. There seems to be an error in the messages in Your topic. You can see here

WebUI show custom config

2019-06-21 Thread Dominik Wosiński
Hey, I am building jobs that use Typesafe Config under the hood. The configs tend to grow big. I was wondering whether there is a possibility to use WebUI to show the config that the job was run with, currently the only idea is to log the config and check it inside the logs, but with dozens of

Re: Flink Control Stream

2019-04-25 Thread Dominik Wosiński
Thanks for help Till, I thought so, but I wanted to be sure. Best Regards, Dom.

Re: kafka corrupt record exception

2019-04-25 Thread Dominik Wosiński
Hey, Sorry for such a delay, but I have missed this message. Basically, technically you could have Kafka broker installed in version say 1.0.0 and using FlinkKafkaConsumer08. This could technically create issues. I'm not sure if You can automate the process of skipping corrupted messages, as You

Flink Control Stream

2019-04-24 Thread Dominik Wosiński
Hey, I wanted to use the control stream to dynamically adjust parameters of the tasks. I know that it is possible to use *connect()* and *BroadcastState *to obtain such a thing. But I would like to have the possibility to control the parameters inside the *AsyncFunction. *Like specific timeout for

HadoopInputFormat

2019-11-06 Thread Dominik Wosiński
Hey, I wanted to ask if the *HadoopInputFormat* does currently support some custom partitioning scheme ? Say I have 200 files in HDFS each having the partitioning key in name, can we ATM use HadoopInputFormat to distribute reading to multiple TaskManagers using the key ?? Best Regards, Dom.

Re: Implementing a tick service

2020-01-21 Thread Dominik Wosiński
Hey, you have access to context in `onTimer` so You can easily reschedule the timer when it is fired. Best, Dom.

Objects with fields that are not serializable

2020-04-14 Thread Dominik Wosiński
Hey, I have a question about using classes with fields that are not serializable in DataStream. Basically, I would like to use the Java's Optional in DataStream. So Say I have a class *Data *that has several optional fields and I would like to have *DataStream*. I don't think this should cause

StreamQueryConfig vs TemporalTableFunction

2020-04-20 Thread Dominik Wosiński
Hey, I wanted to ask whether the TemporalTableFunctions are subject to StreamQueryConfig retention? I was pretty sure that they are not, but I have recently noticed weird behavior in one of my jobs that suggests that they indeed are. Thanks for answers, Best Regards, Dom.

Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
Hey, I have noticed a weird behavior with a job that I am currently working on. I have 4 different streams from Kafka, lets call them A, B, C and D. Now the idea is that first I do SQL Join of A & B based on some field, then I create append stream from Joined A, let's call it E. Then I need to

Re: Issues with Watermark generation after join

2020-03-16 Thread Dominik Wosiński
ampsAndWatermarks` directly > post to the (co-)process function. > > Best regards > Theo > > -- > *Von: *"Dominik Wosiński" > *An: *"user" > *Gesendet: *Montag, 16. März 2020 16:55:18 > *Betreff: *Issues with Watermark g

Re: How to consume kafka from the last offset?

2020-03-26 Thread Dominik Wosiński
Hey, Are You completely sure you mean *auto.offset.reset ?? *False is not valid setting for that AFAIK. Best, Dom. czw., 26 mar 2020 o 08:38 Jim Chen napisał(a): > Thanks! > > I made a mistake. I forget to set the auto.offset.reset=false. It's my > fault. > > Dominik Wosińsk

Re: How to consume kafka from the last offset?

2020-03-25 Thread Dominik Wosiński
Hi Jim, Well, *auto.offset.reset *is only used when there is no offset saved for this *group.id * in Kafka. So, if You want to read the data from the latest record (and by latest I mean the newest here) You should assign the *group.id * that was not previously

Timestamp Erasure

2020-03-18 Thread Dominik Wosiński
Hey, I just wanted to ask about one thing about timestamps. So, currently If I have a KeyedBroadcastProcess function followed by Temporal Table Join, it works like a charm. But, say I want to delay emitting some of the results due to any reason. So If I *registerProcessingTimeTimer* and any

Re: Timestamp Erasure

2020-03-19 Thread Dominik Wosiński
ails in the doc [1]. > > Best, > Jark > > [1]: > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/time_attributes.html#during-datastream-to-table-conversion-1 > > On Thu, 19 Mar 2020 at 02:38, Dominik Wosiński wrote: > >> Hey, >> I just want

Re: Issues with Watermark generation after join

2020-03-19 Thread Dominik Wosiński
I have created a simple minimal reproducible example that shows what I am talking about: https://github.com/DomWos/FlinkTTF/tree/sql-ttf It contains a test that shows that even if the output is in order which is enforced by multiple sleeps, then for parallelism > 1 there is no output and for

Re: Issues with Watermark generation after join

2020-03-17 Thread Dominik Wosiński
Hey sure, the original Temporal Table SQL is: |SELECT e.*, f.level as level FROM | enablers AS e, | LATERAL TABLE (Detectors(e.timestamp)) AS f | WHERE e.id= f.id |"" And the previous SQL query to join A is something like : SELECT * | FROM A te, | B s | WHERE s.id = te.id AND s.level = te.level

Re: Issues with Watermark generation after join

2020-03-24 Thread Dominik Wosiński
ally > for the job: > > watermark of operator = min(previous operator partition 1, previous > operator partition 2, ...) > > I hope this helps. > > Regards, > Timo > > > On 19.03.20 16:38, Dominik Wosiński wrote: > > I have created a simple minimal reproducible ex

Fwd: AfterMatchSkipStrategy for timed out patterns

2020-03-16 Thread Dominik Wosiński
Hey all, I was wondering whether for CEP the *AfterMatchSkipStrategy *is applied during matching or if simply the results are removed after the match. The question is the result of the experiments I was doing with CEP. Say I have the readings from some sensor and I want to detect events over some

Flink KafkaProducer Failed Transaction Stalling the whole flow

2023-12-18 Thread Dominik Wosiński
Hey, I've got a question regarding the transaction failures in EXACTLY_ONCE flow with Flink 1.15.3 with Confluent Cloud Kafka. The case is that there is a FlinkKafkaProducer in EXACTLY_ONCE setup with default *transaction.timeout.ms *of 15min. During the