Re: [DISCUSS] Flink Kubernetes Operator controller flow

2022-03-01 Thread Gyula Fóra
it is ready with the base changes according to the discussion. Once we have this merged it will be easier to parallelize work on the state machine/observation/reconcile logic. Gyula On Tue, Mar 1, 2022 at 11:24 AM Gyula Fóra wrote: > Hi All! > > Based on your ideas and suggestions I have m

Re: [DISCUSS] Flink Kubernetes Operator controller flow

2022-03-01 Thread Gyula Fóra
ine is going to replace the annoying if-else in the > reconciler. > It seems to have no conflicts with modular mechanism(aka observer, > reconciler, etc.). > And we could make them happen step by step. > > > Best, > Yang > > > Gyula Fóra wrote on Tue, Mar 1, 2022 at 13:53: > >

Re: [DISCUSS] Flink Kubernetes Operator controller flow

2022-02-28 Thread Gyula Fóra
of the expected state and what it should observe on the happy path. Gyula On Tue, 1 Mar 2022 at 06:42, Gyula Fóra wrote: > @Thomas: > > Thanks for the input! I completely agree that a well-principled state > machine based approach in the operator would be the best. > > You are r

Re: [DISCUSS] Flink Kubernetes Operator controller flow

2022-02-28 Thread Gyula Fóra
> [1] > https://github.com/lyft/flinkk8soperator/blob/master/docs/state_machine.md > > On Mon, Feb 28, 2022 at 8:16 AM Gyula Fóra wrote: > > > > Hi All! > > > > Thank you for the feedback, I agree with what has been proposed to > include > > as m

Re: [DISCUSS] Flink Kubernetes Operator controller flow

2022-02-28 Thread Gyula Fóra
idator` `observer` `reconciler` makes lots > of > > > > sense. And the "Validate -> Observe -> Reconcile" > > > > flow seems natural to me. > > > > > > > > Regarding the implementation in the PR, instead of directly using the > >
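
The "Validate -> Observe -> Reconcile" flow mentioned in the snippet above can be illustrated with a small sketch. It is not taken from the operator's code base: `Deployment`, `Validator`, `Observer`, `Reconciler`, and `runOnce` are hypothetical names chosen for illustration, and the "resource" is reduced to a single parallelism value.

```java
import java.util.Optional;

// Hypothetical sketch of a Validate -> Observe -> Reconcile pass.
// None of these types come from flink-kubernetes-operator.
final class ControllerFlowSketch {

    /** Stand-in for a custom resource: desired spec plus last observed status. */
    static final class Deployment {
        final int desiredParallelism;
        int observedParallelism;

        Deployment(int desiredParallelism, int observedParallelism) {
            this.desiredParallelism = desiredParallelism;
            this.observedParallelism = observedParallelism;
        }
    }

    interface Validator { Optional<String> validate(Deployment d); }
    interface Observer { void observe(Deployment d); }
    interface Reconciler { void reconcile(Deployment d); }

    /** Runs one controller pass; the later steps are skipped if validation fails. */
    static String runOnce(Deployment d, Validator v, Observer o, Reconciler r) {
        Optional<String> error = v.validate(d);
        if (error.isPresent()) {
            return "INVALID: " + error.get();
        }
        o.observe(d);   // refresh the status from the "cluster"
        r.reconcile(d); // act on the difference between spec and status
        return "RECONCILED";
    }

    public static void main(String[] args) {
        Deployment d = new Deployment(4, 2);
        String result = runOnce(
                d,
                dep -> dep.desiredParallelism > 0
                        ? Optional.<String>empty()
                        : Optional.of("parallelism must be positive"),
                dep -> { /* a real observer would query the cluster here */ },
                dep -> dep.observedParallelism = dep.desiredParallelism);
        System.out.println(result + " at parallelism " + d.observedParallelism);
    }
}
```

Keeping the three steps behind separate interfaces means each one can be tested on its own, which is one plausible reading of the modular split discussed in the thread.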

[DISCUSS] Flink Kubernetes Operator controller flow

2022-02-28 Thread Gyula Fóra
Hi All! I would like to start a discussion thread regarding the structure of the Kubernetes Operator controller

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-20 Thread Gyula Fóra
Hi! Thank you for your interest in contributing to the operator. The operator persists information in the status of the FlinkDeployment resource. We should not need any additional persistence layer on top of this in the current design. Could you please give me a concrete example of what is not

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-16 Thread Gyula Fóra
d be "running". The only API > that I am aware of that has something like "suspended" is a Kubernetes Job > ( > > https://kubernetes.io/docs/concepts/workloads/controllers/job/#suspending-a-job > ), > which looks retrofitted to me. > > Cheers, >

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-16 Thread Gyula Fóra
n't a library but effectively just a docker > > image, the ability to change the Java version isn't as critical as it > > is for Flink core, which needs to run in many different environments. > > > > Cheers, > > Thomas > > > > On Tue, Feb 15, 2022 at

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-15 Thread Gyula Fóra
lso lean to not introduce the savepoint/checkpoint related fields to the > job spec, especially in the very beginning of flink-kubernetes-operator. > > > Best, > Yang > > Gyula Fóra wrote on Tue, Feb 15, 2022 at 19:02: > > > Hi Peng Yuan! > > > > While I do agree that

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-15 Thread Gyula Fóra
pache/flink/flink-examples-streaming_2.12/1.14.3/flink-examples-streaming_2.12-1.14.3-WordCount.jar > > > > - name: DEST_PATH > > > > value: /cache/flink-app.jar > > > > command: ['sh', '-c', 'curl -o ${DEST_PATH} ${JAR_URL}'] > > > >

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-14 Thread Gyula Fóra
Best Wishes! > Peng Yuan > > On Mon, Feb 14, 2022 at 4:14 PM Gyula Fóra wrote: > > > Hi Peng Yuan! > > > > The repo is already created: > > https://github.com/apache/flink-kubernetes-operator > > > > We will open the PR with the initial prototype

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-14 Thread Gyula Fóra
of flink-kubernetes-operator been created in github? > > Peng Yuan > > On Wed, Feb 9, 2022 at 1:23 AM Gyula Fóra wrote: > > > I agree with flink-kubernetes-operator as the repo name :) > > Don't have any better idea > > > > Gyula > > > > On S

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-08 Thread Gyula Fóra
ink-k8s-operator" but that would be almost > identical to existing operator implementations and could lead to > confusion in the future. > > Thoughts? > > Thanks, > Thomas > > > > On Fri, Feb 4, 2022 at 5:15 AM Gyula Fóra wrote: > > > > Hi Danny, >

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-07 Thread Gyula Fóra
> > > > > > Does this mean if someone wants to upgrade Flink to a version that is > > released after the operator version that is being used, he/she would need > > to upgrade the operator version first? > > I'm not questioning this, just trying to make sure I'm un

[DISCUSS] Checkpointing (partially) failing jobs

2022-02-06 Thread Gyula Fóra
Hi all! At the moment checkpointing only works for healthy jobs with all running (or some finished) tasks. This sounds reasonable in most cases but there are a few applications where it would make sense to checkpoint failing jobs as well. Due to how the checkpointing mechanism works, subgraphs

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-06 Thread Gyula Fóra
rator > > I thought "flink-operator" could be a bit misleading since the term > operator already has a meaning in Flink. > > I also considered "flink-k8s-operator" but that would be almost > identical to existing operator implementations and could lead to > confusion

Re: [VOTE] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-05 Thread Gyula Fóra
Hi Thomas! +1 (binding) from my side Happy to see this effort getting some traction! Cheers, Gyula On Sat, Feb 5, 2022 at 3:00 AM Thomas Weise wrote: > Hi everyone, > > I'd like to start a vote on FLIP-212: Introduce Flink Kubernetes > Operator [1] which has been discussed in [2]. > > The

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-04 Thread Gyula Fóra
een discussing this one with my team. We are interested in the > Standalone mode, and are willing to contribute towards the implementation. > Potentially we can work together to support both modes in parallel? > > Thanks, > > On Wed, Feb 2, 2022 at 4:02 PM Gyula Fóra wrote: > >

Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

2022-02-04 Thread Gyula Fóra
s into Flink that we >>> regretted >>> later. >>> >>> As I've said, I don't have a better idea right now. If we believe that it >>> is the right thing to make Flink responsible for distributing the tokens >>> and we don't find a better solution then we'

Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

2022-02-03 Thread Gyula Fóra
Hi Team! Let's all calm down a little and not let our emotions affect the discussion too much. There has been a lot of effort spent from all involved parties so this is quite understandable :) Even though not everyone said this explicitly, it seems that everyone more or less agrees that a

Re: [DISCUSS] FLIP-211: Kerberos delegation token framework

2022-02-03 Thread Gyula Fóra
Hi Till! The delegation token framework solves a few production problems, KDC scalability is just one and probably not the most important. As Gabor has explained, some of these are: - Solves the problem for token renewal for long running jobs which would currently time out and die - Improves

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-02-02 Thread Gyula Fóra
Hi Danny! Thanks for the feedback :) Versioning: Versioning will be independent from Flink and the operator will depend on a fixed flink version (in every given operator version). This should be the exact same setup as with Stateful Functions ( https://github.com/apache/flink-statefun). So

Re: [DISCUSS] FLIP-212: Introduce Flink Kubernetes Operator

2022-01-27 Thread Gyula Fóra
Hi All! Thanks for the questions, there are still quite a few unknowns and decisions to be made but here are my current thoughts: # 1 Flink Native vs Standalone integration Maybe we should make this more clear in the FLIP but we agreed to do the first version of the operator based on the native

Re: [VOTE] FLIP-211: Kerberos delegation token framework

2022-01-24 Thread Gyula Fóra
Hi Gabor, +1 (binding) from me This is a great effort and significant improvement to the Kerberos security story. Cheers Gyula On Fri, 21 Jan 2022 at 15:58, Gabor Somogyi wrote: > Hi devs, > > I would like to start the vote for FLIP-211 [1], which was discussed and > reached a consensus in

Re: Flink native k8s integration vs. operator

2022-01-17 Thread Gyula Fóra
>> late. > >>> >> > > >>> >> > I agree with Thomas' and David's assessment of Flink's "Native > >>> >> Kubernetes > >>> >> > Integration", in particular, it does actually not integrate well

Re: [ANNOUNCE] Dropping "CheckpointConfig.setPreferCheckpointForRecovery()"

2021-08-25 Thread Gyula Fóra
Hi Stephan, I do not know if anyone is still relying on this but I think it makes sense to drop this feature. So +1 from me. I think it served a valid purpose originally but if we have a good improvement in the pipeline using the savepoints directly that will solve the problem properly. I would

Re: [VOTE] FLIP-181: Custom netty HTTP request inbound/outbound handlers

2021-07-06 Thread Gyula Fóra
+1 from my side This is a good addition that will open many possibilities in the future and solve some immediate issues with the current Kerberos integration. Gyula On Tue, Jul 6, 2021 at 2:50 PM Márton Balassi wrote: > Hi everyone, I would like to start a vote on FLIP-181 [1] which was >

Re: [DISCUSS] Dashboard/HistoryServer authentication

2021-06-04 Thread Gyula Fóra
Hi! I think there might be possible alternatives but it seems Kerberos on the rest endpoint ticks all the right boxes and provides a super clean and simple solution for strong authentication. I wouldn’t even consider sidecar proxies etc if we can solve it in such a simple way as proposed by G.

Re: [DISCUSS] Simplify SQL lookup join (temporal join with latest) syntax

2021-04-20 Thread Gyula Fóra
e already have a thread of this topic: > > http://apache-flink-mailing-list-archive.1008284.n3.nabble.com/DISCUSS-Make-Temporal-Join-syntax-easier-to-use-td47296.html > FYI. > > Best, > Jingsong > > On Tue, Apr 20, 2021 at 4:55 PM Gyula Fóra wrote: > > > Hi All!

[DISCUSS] Simplify SQL lookup join (temporal join with latest) syntax

2021-04-20 Thread Gyula Fóra
Hi All! Playing around with the SQL syntax for temporal join with latest table I feel there is some room for optimizing the current syntax to provide a better user experience. The current system for specifying the lookup side is: lookuptable FOR SYSTEM_TIME AS OF probe.proctime_column It feels

Re: Source of Cloudera 3 Hadoop Flink shaded uber jars?

2021-02-10 Thread Gyula Fóra
Hi Adam! Cloudera has published these artifacts before, but, as in the Flink project itself, they are no longer used. You should set the HADOOP_CLASSPATH to the Hadoop libs instead of using the shaded jars. Cheers Gyula On Wed, Feb 10, 2021 at 7:27 PM Chesnay Schepler wrote: > I can't tell you

Re: [DISCUSS] Releasing Apache Flink 1.11.3

2020-11-18 Thread Gyula Fóra
Hi All! I have found the following issue today which might be considered a blocker for this release as well: https://issues.apache.org/jira/browse/FLINK-20221 Could someone please quickly provide a second set of eyes and confirm that this is indeed a big problem? :) Thank you! Gyula On Wed,

Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-08-07 Thread Gyula Fóra
tuff into Flink, and legacy, >> experimental or unstable connectors into Bahir. >> >> >> Who can take care of this effort? (Decide which Hbase 2 PR to take, >> review and contribution to Bahir) >> >> >> [1] https://issues.apache.org/jira/browse/FLINK-18795

Re: Re: [DISCUSS] Upgrade HBase connector to 2.2.x

2020-06-19 Thread Gyula Fóra
d user-zh@ to hear more feedback from users. > > Best, > Jark > > On Thu, 18 Jun 2020 at 21:25, Gyula Fóra wrote: > >> Hi All! >> >> I would like to revive an old ticket >> <https://issues.apache.org/jira/browse/FLINK-9849> and discussion around >

[DISCUSS] Upgrade HBase connector to 2.2.x

2020-06-18 Thread Gyula Fóra
Hi All! I would like to revive an old ticket and discussion around upgrading the HBase version of the connector. The current HBase version is 1.4.3 which is over 2 years old at this point and incompatible with the newer HBase versions used at

Re: [DISCUSS] Rework History Server into Global Dashboard

2020-05-15 Thread Gyula Fóra
ad of adding more complexity and more code to maintain. > > Maybe we could add this feature as a Flink package instead. That way it > would still be available to our users. If it gains enough traction then we > can also add it to Flink later. What do you think? > > Cheers, > T

Re: [DISCUSS] Rework History Server into Global Dashboard

2020-05-13 Thread Gyula Fóra
It seems that not everyone can see the screenshot in the email, so here is a link: https://drive.google.com/open?id=1abrlpI976NFqOZSX20k2FoiAfVhBbER9 On Wed, May 13, 2020 at 11:29 AM Gyula Fóra wrote: > Oops I forgot the screenshot, thanks Ufuk :D > > > @Jeff Zhang : Yes we

Re: [DISCUSS] Rework History Server into Global Dashboard

2020-05-13 Thread Gyula Fóra
y helpful for flink jobs and cluster > operations. Do you call flink rest api to gather the job info ? I hope this > history server could work with multiple versions of flink as long as the > flink rest api is compatible. > > Gyula Fóra wrote on Wed, May 13, 2020 at 4:13 PM: > > > Hi All

[DISCUSS] Rework History Server into Global Dashboard

2020-05-13 Thread Gyula Fóra
Hi All! With the growing number of Flink streaming applications the current HS implementation is starting to lose its value. Users running streaming applications mostly care about what is running right now on the cluster and a centralised view on history is not very useful. We have been

Re: UpsertStreamTableSink vs OverwritableTableSink

2020-05-06 Thread Gyula Fóra
> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-95%3A+New+TableSource+and+TableSink+interfaces#FLIP-95:NewTableSourceandTableSinkinterfaces-SinkInterfaces > > > Best, > Jingsong Lee > > On Wed, May 6, 2020 at 8:48 PM Gyula Fóra wrote: > > > Hi all!

UpsertStreamTableSink vs OverwritableTableSink

2020-05-06 Thread Gyula Fóra
Hi all! While working on a Table Sink implementation for Kudu (key-value store), we got a bit confused about the expected semantics of UpsertStreamTableSink vs OverwritableTableSink and statements like INSERT vs INSERT OVERWRITE. I am wondering what external operation should each incoming record

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-27 Thread Gyula Fóra
Hi Jack! You can find the document here: https://docs.google.com/document/d/1wSgzPdhcwt-SlNBBqL-Zb7g8fY6bN8JwHEg7GCdsBG8/edit?usp=sharing The document links to an already working Atlas hook prototype (and accompanying flink fork). The links for that are also here: Flink:

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-13 Thread Gyula Fóra
a method that in some > cases works as expected and in some other ones it does not. It would > be nice if we could expose consistent behaviour to the users. > > On Thu, Mar 12, 2020 at 8:44 PM Gyula Fóra wrote: > > > > Thanks Kostas, I have to review the possible limitations

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-12 Thread Gyula Fóra
to go "public > > > stable", but doesn't really do it exactly because of mixing "pipeline" > into > > > this. > > > You would need to cast "Pipeline" and work on internal classes in the > > > implementation. > > > > >

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-03-12 Thread Gyula Fóra
Hi Stephan! Thanks for checking this out. I agree that wrapping the other PipelineExecutors with a delegating AtlasExecutor would be a good alternative approach to implement this but I actually feel that it suffers from even more problems than exposing the Pipeline instance in the JobListener. The

Re: Creating TemporalTable based on Catalog table in SQL Client

2020-03-04 Thread Gyula Fóra
I guess it will only work now if you specify the catalog name too when referencing the table. On Wed, Mar 4, 2020 at 11:15 AM Gyula Fóra wrote: > You are right but still if the default catalog is something else and > that's the one containing the table then it still won't work cur

Re: Creating TemporalTable based on Catalog table in SQL Client

2020-03-04 Thread Gyula Fóra
> setting an already registered catalog as the current one. As you can see > from the method and its comment, catalogs are loaded first before any > tables in yaml are registered, so you should be able to achieve what you > described. > > Bowen > > On Tue, Mar 3, 2020 at 5:16 A

Creating TemporalTable based on Catalog table in SQL Client

2020-03-03 Thread Gyula Fóra
Hi all! I was testing the TemporalTable functionality in the SQL client while using the Hive Catalog and I ran into the following problem. I have a table created in the Hive catalog and I want to create a temporal table over it. As we cannot create temporal tables in SQL directly I have to

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-20 Thread Gyula Fóra
for that are also here: Flink: https://github.com/gyfora/flink/tree/atlas-changes Atlas: https://github.com/gyfora/atlas/tree/flink-bridge Thank you! Gyula On Thu, Feb 13, 2020 at 4:43 PM Gyula Fóra wrote: > Thanks for the feedback Aljoscha! > > I have a POC ready with the Flink changes +

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-13 Thread Gyula Fóra
ine should be ok. Using the internal > StreamGraph might be problematic because this might change/break but > that's a problem of the external code. > > Aljoscha > > On 11.02.20 16:26, Gyula Fóra wrote: > > Hi All! > > > > I have made a prototype

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-11 Thread Gyula Fóra
want to register in Atlas from the available connectors. Ideally users could also somehow register their own Atlas metadata for custom sources and sinks, we could probably introduce an interface for that in Atlas. Cheers, Gyula On Fri, Feb 7, 2020 at 10:37 AM Gyula Fóra wrote: > Maybe we co

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-07 Thread Gyula Fóra
AM Gyula Fóra wrote: > Hi Aljoscha! > > That's a valid concern but we should try to figure something out, many > users need this before they can use Flink. > > I think the closest thing we have right now is the StreamGraph. In > contrast with the JobGraph the StreamGrap

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-07 Thread Gyula Fóra
e need it, we can probably beef up the JobListener to allow > accessing some information about the whole graph or sources and sinks. > My only concern right now is that we don't have a stable interface for > our job graphs/pipelines right now. > > Best, > Aljoscha > > On 06

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-06 Thread Gyula Fóra
istener which is invoked after job submission and > finished. Maybe we can add an API on JobClient to get what info you needed for > Atlas integration. > > > https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/core/execution/JobListener.java#L46 > >

Re: [Discussion] Job generation / submission hooks & Atlas integration

2020-02-05 Thread Gyula Fóra
have tool which receives job submissions, extracts the required > > > information, forwards the job submission to Flink, monitors the > execution > > > result and finally publishes some information to Atlas (modulo some > other > > > steps which are missing in my descr

[Discussion] Job generation / submission hooks & Atlas integration

2020-02-05 Thread Gyula Fóra
Hi all! We have started some preliminary work on the Flink - Atlas integration at Cloudera. It seems that the integration will require some new hook interfaces at the jobgraph generation and submission phases, so I figured I will open a discussion thread with my initial ideas to get some early

Re: [DISCUSS] Multi-topics consuming from KafkaTableSource

2019-11-26 Thread Gyula Fóra
Hi all, I think the same reasoning applies here as with the regular Kafka consumer sources. Those support multiple topics, even topic patterns assuming the same schema. Sometimes you have so many topics that you can't create a source for each or as with the pattern the topics might be changing

Re: [DISCUSSION] Kafka Metrics Reporter

2019-11-20 Thread Gyula Fóra
a > >> new JSON format. Also in many cases, there could be a lot of metric > >> messages sent by the Flink jobs. JSON format is less efficient and > >> might > >> have too much overhead in that case. > >> > >> Thanks, > >>

[DISCUSSION] Kafka Metrics Reporter

2019-11-17 Thread Gyula Fóra
Hi all! Several users have asked in the past about a Kafka based metrics reporter which can serve as a natural connector between arbitrary metric storage systems and a straightforward way to process Flink metrics downstream. I think this would be an extremely useful addition but I would like to

Re: [VOTE] FLIP-59: Enable execution configuration from Configuration object

2019-11-14 Thread Gyula Fóra
+1 (binding) On Thu, Nov 14, 2019 at 11:07 AM Jeff Zhang wrote: > +1 (non-binding) > > Kostas Kloudas wrote on Thu, Nov 14, 2019 at 6:04 PM: > > > +1 (binding) > > > > On Tue, Nov 12, 2019 at 10:20 AM tison wrote: > > > > > > +1 (binding) > > > > > > Best, > > > tison. > > > > > > > > > Aljoscha Krettek

Re: [DISCUSS] Flink Avro Cloudera Registry (FLINK-14577)

2019-11-06 Thread Gyula Fóra
l > record (de)serialization happens. It also provides a unified way of > instantiating the TypeInformation. Could you give some explanation why > would you prefer not to use this approach? > > Best, > > Dawid > > On 05/11/2019 14:48, Gyula Fóra wrote: > > Thanks Matyas f

Re: [DISCUSS] Flink Avro Cloudera Registry (FLINK-14577)

2019-11-05 Thread Gyula Fóra
Thanks Matyas for starting the discussion! I think this would be a very valuable addition to Flink as many companies are already using the Hortonworks/Cloudera registry and it would enable them to connect to Flink easily. @Dawid: Regarding the implementation this is a much more lightweight connector

Re: [VOTE] Accept Stateful Functions into Apache Flink

2019-10-31 Thread Gyula Fóra
+1 from me, this is a great addition to Flink! Gyula On Thu, Oct 31, 2019, 03:52 Yun Gao wrote: > +1 (non-binding) > Very thanks for bringing this to the community! > > > -- > From:jincheng sun > Send Time:2019 Oct. 31

Re: [DISCUSS] Improve Flink logging with contextual information

2019-10-18 Thread Gyula Fóra
ase let me know if any of the suggestions above helps. > > > > Cheers, > > Rong > > > > [1] > > > https://github.com/logstash/logstash-logback-encoder/blob/master/src/test/resources/logback-test.xml#L13 > > [2] https://github.com/logstash/logstash-logback

[DISCUSS] Improve Flink logging with contextual information

2019-10-03 Thread Gyula Fóra
Hi all! We have been thinking that it would be a great improvement to add contextual information to the Flink logs: - Container / yarn / host info to JM/TM logs - Job info (job id/ jobname) to task logs I think this should be similar to how the metric scopes are set up and should be able to provide

Re: [DISCUSS] Releasing Flink 1.9.1

2019-09-24 Thread Gyula Fóra
+1 for 1.9.1 soon :) I would also like to include a fix to: FLINK-14145 - getLatestCheckpoint(true) returns wrong checkpoint It is already merged to master and just need to merge it to 1.9 if we all agree (https://github.com/apache/flink/pull/9756) Cheers, Gyula On Tue, Sep 24, 2019 at 8:23 AM

Re: [DISCUSS] Adding configure and close methods to the Kafka(De)SerializationSchema interfaces

2019-09-05 Thread Gyula Fóra
org/apache/flink/formats/avro/registry/confluent/ConfluentRegistryAvroDeserializationSchema.java > > On Thu, Sep 5, 2019 at 9:43 AM Gyula Fóra wrote: > > > Hi all! > > > > While implementing a new custom flink serialization schema that wraps an > > existing Kafka serializer, I realized we are missing 2 key met

[DISCUSS] Adding configure and close methods to the Kafka(De)SerializationSchema interfaces

2019-09-05 Thread Gyula Fóra
Hi all! While implementing a new custom flink serialization schema that wraps an existing Kafka serializer, I realized we are missing 2 key methods that could be easily added: void configure(java.util.Map configs); void close(); We could rename configure to open but Kafka serializers have a
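
The two lifecycle methods proposed in the snippet above can be sketched as follows. This is an illustration only and does not depend on Flink or Kafka: `ConfigurableSerializationSchema` and `PrefixingStringSchema` are hypothetical names, the `prefix` config key is invented for the example, and the `Map<String, ?>` parameter type is an assumption (the snippet shows a raw `java.util.Map`).

```java
import java.nio.charset.StandardCharsets;
import java.util.Collections;
import java.util.Map;

// Hypothetical sketch: a serialization schema with configure/close lifecycle hooks.
final class SchemaLifecycleSketch {

    /** Hypothetical interface mirroring the two methods proposed in the thread. */
    interface ConfigurableSerializationSchema<T> {
        void configure(Map<String, ?> configs); // called once before the first record
        byte[] serialize(T element);
        void close();                           // called once when the sink shuts down
    }

    /** Toy implementation: prepends a configurable prefix to each string. */
    static final class PrefixingStringSchema implements ConfigurableSerializationSchema<String> {
        private String prefix = "";
        private boolean closed;

        @Override
        public void configure(Map<String, ?> configs) {
            Object value = configs.get("prefix"); // invented key, for illustration only
            if (value != null) {
                prefix = value.toString();
            }
        }

        @Override
        public byte[] serialize(String element) {
            if (closed) {
                throw new IllegalStateException("schema already closed");
            }
            return (prefix + element).getBytes(StandardCharsets.UTF_8);
        }

        @Override
        public void close() {
            closed = true; // a wrapped serializer would release its resources here
        }

        boolean isClosed() {
            return closed;
        }
    }

    public static void main(String[] args) {
        PrefixingStringSchema schema = new PrefixingStringSchema();
        schema.configure(Collections.singletonMap("prefix", "k-"));
        System.out.println(new String(schema.serialize("a"), StandardCharsets.UTF_8));
        schema.close();
    }
}
```

A schema that wraps an existing serializer would forward `configure` and `close` to it, which is the use case the message describes.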

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-08-30 Thread Gyula Fóra
can provide a thin wrapper around Configuration for e.g. filtering > certain logic, changing values based on other parameters etc. Is that > what you had in mind? > > Best, > > Dawid > > On 29/08/2019 19:21, Gyula Fóra wrote: > > Hi! > > > > Huuuge +1 from me, this

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-08-29 Thread Gyula Fóra
after the default behaviour that you outlined in the FLIP. What do you think? Gyula On Thu, Aug 29, 2019 at 7:21 PM Gyula Fóra wrote: > Hi! > > Huuuge +1 from me, this has been an operational pain for years. > This would also introduce a nice and simple way to extend it in the fu

Re: [DISCUSS] FLIP-59: Enable execution configuration from Configuration object

2019-08-29 Thread Gyula Fóra
Hi! Huuuge +1 from me, this has been an operational pain for years. This would also introduce a nice and simple way to extend it in the future if we need. Ship it! Gyula On Thu, Aug 29, 2019 at 5:05 PM Dawid Wysakowicz wrote: > Hi, > > I wanted to propose a new, additional way of configuring

Re: [CODE-STYLE] Builder pattern

2019-08-29 Thread Gyula Fóra
>>>> > >>>>>> On 26 Aug 2019, at 15:12, Jark Wu wrote: > >>>>>> > >>>>>> Hi Gyula, > >>>>>> > >>>>>> Thanks for bringing this. I think it would be nice if we have a > common > >>&

[CODE-STYLE] Builder pattern

2019-08-26 Thread Gyula Fóra
Hi All! I would like to start a code-style related discussion regarding how we implement the builder pattern in the Flink project. It would be best to have a common approach. Some aspects of the pattern that come to my mind are listed below; please feel free to add anything I missed: 1. Creating

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-16 Thread Gyula Fóra
>>>>>>>>>> of > > > > > > > > > >>>>>>>>>>>> flink-dist. I've created > > > > > > > > > >>>>>>>>>>> > > https://issues.apache.org/jira/b

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-12 Thread Gyula Fóra
at is the recommended way these days. > > Best, > Stephan > > > > On Mon, Aug 12, 2019 at 10:48 AM Gyula Fóra wrote: > > > Thanks Dawid, > > > > In the meantime I also figured out that I need to build the > > https://github.com/apache/flink-shaded

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-12 Thread Gyula Fóra
. > > Best, > > Dawid > > On 11/08/2019 19:31, Gyula Fóra wrote: > > Hi again, > > > > How do I build the RC locally with the hadoop version specified? Seems > like > > no matter what I do I run into dependency problems with the shaded hadoop > >

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-11 Thread Gyula Fóra
appreciate any pointers :) Thanks! Gyula On Sun, Aug 11, 2019 at 6:57 PM Gyula Fóra wrote: > Hi! > > I am trying to build 1.9.0-rc2 with the -Pvendor-repos profile enabled. I > get the following error: > > mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.6.0 > -Pi

Re: [VOTE] Apache Flink Release 1.9.0, release candidate #2

2019-08-11 Thread Gyula Fóra
Hi! I am trying to build 1.9.0-rc2 with the -Pvendor-repos profile enabled. I get the following error: mvn clean install -DskipTests -Pvendor-repos -Dhadoop.version=2.6.0 -Pinclude-hadoop (ignore that the hadoop version is not a vendor hadoop version) [ERROR] Failed to execute goal on project

Re: List of consumed kafka topics should not be restored from state

2019-02-15 Thread Gyula Fóra
sues.apache.org/jira/browse/FLINK-10342> > > > - https://issues.apache.org/jira/browse/FLINK-9303 < > > > https://issues.apache.org/jira/browse/FLINK-9303> > > > > > > The second one only because it’s slightly related. The first one is > >

Re: List of consumed kafka topics should not be restored from state

2019-02-13 Thread Gyula Fóra
his can only be modified when > > restoring from savepoints (i.e. manual restores). > > To avoid breaking the current behaviour, we can maybe add a > > `filterRestoredPartitionOffsetState()` configuration on the consumer, > which > > by default is disabled to match the cur
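
The idea raised in the quoted reply, dropping restored partition state for topics the consumer no longer subscribes to, can be sketched as follows. This is illustration only: `filterRestored` and the map layout (topic name to partition-offset map) are hypothetical, and the code does not use or reproduce the Flink Kafka consumer API. The `filterRestoredPartitionOffsetState()` option named in the snippet is a proposal from the thread.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

// Hypothetical sketch: keep only restored offsets whose topic is still subscribed.
final class RestoredOffsetsFilterSketch {

    /**
     * @param restored   topic -> (partition -> offset), as read back from state
     * @param subscribed topics the consumer is currently configured with
     * @return a new map containing only the entries for subscribed topics
     */
    static Map<String, Map<Integer, Long>> filterRestored(
            Map<String, Map<Integer, Long>> restored, Set<String> subscribed) {
        Map<String, Map<Integer, Long>> kept = new HashMap<>();
        for (Map.Entry<String, Map<Integer, Long>> entry : restored.entrySet()) {
            if (subscribed.contains(entry.getKey())) {
                kept.put(entry.getKey(), entry.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, Map<Integer, Long>> restored = new HashMap<>();
        for (String topic : Arrays.asList("t1", "t2", "t3")) {
            Map<Integer, Long> offsets = new HashMap<>();
            offsets.put(0, 100L);
            restored.put(topic, offsets);
        }
        // The job was restarted with t3 removed from the topic list.
        Set<String> subscribed = new HashSet<>(Arrays.asList("t1", "t2"));
        System.out.println(filterRestored(restored, subscribed).keySet());
    }
}
```

Without such a filter, the restored state for `t3` would keep the consumer reading a topic that is no longer in its configuration, which is the surprise described in the original message of this thread.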

List of consumed kafka topics should not be restored from state

2019-02-13 Thread Gyula Fóra
Hi! I have run into a weird issue which I could have sworn wouldn't happen :D I feel there was a discussion about this in the past, but maybe I'm wrong; I hope someone can point me to a ticket. Let's say you create a Kafka consumer that consumes (t1,t2,t3), you take a savepoint and

Re: [DISCUSSION] Complete restart after successive failures

2019-01-04 Thread Gyula Fóra
wrote: > > > >> Hi Gyula, > >> > >> Personally I do not see a problem with providing such an option of > “clean > >> restart” after N failures, especially if we set the default value for N > to > >> +infinity. However guys working more wi

[DISCUSSION] Complete restart after successive failures

2018-12-29 Thread Gyula Fóra
Hi all! In the past years while running Flink in production we have seen a huge number of scenarios when the Flink jobs can go into unrecoverable failure loops and only a complete manual restart helps. This is in most cases due to memory leaks in the user program, leaking threads etc and it

Re: Hbase state backend in Flink

2018-12-27 Thread Gyula Fóra
Hi! While certainly possible I think it’s a bad idea in general. I think state size itself shouldn’t be a problem with the RocksDB backend as you can always increase parallelism to shard more while keeping the insanely good performance compared to a remote kv store. We and other users have

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Gyula Fóra
ate bootstrapping problem. > Could you please elaborate on how we can leverage the tooling to solve > state bootstrapping? I think this is a common problem to stream processing, > and it will be great the community can work on it. Thanks. > > Shuyi > > On Wed, Aug 22, 2018 at 11

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-22 Thread Gyula Fóra
10:21 AM Aljoscha Krettek > > wrote: > > > > > +1 I'd like to have something like this in Flink a lot! > > > > > > > On 19. Aug 2018, at 11:57, Gyula Fóra wrote: > > > > > > > > Hi all! > > > > > >

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-19 Thread Gyula Fóra
, Gyula Paris Carbone wrote (on Sat, Aug 18, 2018, 9:03): > +1 > > Might also be a good start to implement queryable stream state with > snapshot isolation using that mechanism. > > Paris > > > On 17 Aug 2018, at 12:28, Gyula Fóra wrote: > > > > H

Re: [Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Gyula Fóra
specially in cases when you are not sure > what you are looking for in the state. > > Just to clarify. Is your end goal to contribute such tool to apache Flink > or do you want it to be separate tool? > > Piotrek > > > On 17 Aug 2018, at 12:28, Gyula Fóra wrote: >

[Proposal] Utilities for reading, transforming and creating Streaming savepoints

2018-08-17 Thread Gyula Fóra
Hi All! I want to share with you a little project we have been working on at King (with some help from some dataArtisans folks). I think this would be a valuable addition to Flink and solve a bunch of outstanding production use-cases and headaches around state bootstrapping and state analytics.

Re: Yarn deployment takes long on some networks

2018-03-16 Thread Gyula Fóra
ary you recently also did stuff with YARN, do you maybe have an > idea of what could be going on? > > Best, > Aljoscha > > > On 21. Nov 2017, at 06:42, Gyula Fóra <gyula.f...@gmail.com> wrote: > > > > Hi all! > > > > Today we started noticing t

Re: Serious memory leak in DefaultOperatorStateBackend

2018-02-22 Thread Gyula Fóra
zy and eager state IMO. > > Best, > Stefan > > > On 22.02.2018 at 11:10, Gyula Fóra <gyula.f...@gmail.com> wrote: > > > > Hi all, > > > > We have discovered a fairly serious memory leak > > in DefaultOperatorStateBackend, with broadcast (

Serious memory leak in DefaultOperatorStateBackend

2018-02-22 Thread Gyula Fóra
Hi all, We have discovered a fairly serious memory leak in DefaultOperatorStateBackend, with broadcast (union) list states. The problem seems to occur when a broadcast state name is changed, in order to drop some state (intentionally). Flink does not drop the "garbage" broadcast state, and

Re: [VOTE] Release 1.4.0, release candidate #2

2017-11-29 Thread Gyula Fóra
ight now, I'm assessing how much we need to fix to support Hadoop 2.9.0. > > I'll report later. > > > > Best, Fabian > > > > 2017-11-29 11:16 GMT+01:00 Aljoscha Krettek <aljos...@apache.org>: > > > >> Agreed, this is a regression compared to the prev

Re: [VOTE] Release 1.4.0, release candidate #2

2017-11-29 Thread Gyula Fóra
oc) > > > > > >> On 28. Nov 2017, at 11:20, Aljoscha Krettek <aljos...@apache.org> > wrote: > >> > >> Phew, thanks for the update! > >> > >>> On 28. Nov 2017, at 11:19, Gyula Fóra <gyf...@apache.org> wrote: >

Re: [VOTE] Release 1.4.0, release candidate #2

2017-11-28 Thread Gyula Fóra
Ok seems like I had to remove the snappy jar as it was corrupted (makes total sense) :P Gyula Fóra <gyf...@apache.org> wrote (on Tue, Nov 28, 2017, 11:13): > Hi Aljoscha, > > Thanks for the release candidate. I am having a hard time building the rc, > I seem to get th

Yarn deployment takes long on some networks

2017-11-21 Thread Gyula Fóra
Hi all! Today we started noticing that deploying our jobs took over 3 minutes when deployed from some machine and normal (few seconds) when deployed from the others. Looking at the logs it seems that the client can't find some job id for a few minutes in this case: ... 2017-11-21 15:23:00,880

Re: Zookeeper failure handling

2017-09-29 Thread Gyula Fóra
ould be the recovery of >>> running jobs [3]. With that the TMs could continue executing the jobs even >>> if there is no leader anymore. The new leader (which could be the same JM), >>> would then recover the jobs from the TMs without having to restart them. >>>

Re: Zookeeper failure handling

2017-09-27 Thread Gyula Fóra
owse/FLINK-6174 > [2] https://github.com/apache/flink/pull/3599 > [3] https://issues.apache.org/jira/browse/FLINK-5703 > > Cheers, > Till > > On Wed, Sep 27, 2017 at 10:58 AM, Gyula Fóra <gyula.f...@gmail.com> wrote: > > > Hi Till, > > Thanks for the e

Re: Zookeeper failure handling

2017-09-27 Thread Gyula Fóra
mediately revoking the leadership if one can guarantee that there will > never be two JMs competing for the leadership. However, in the general > case, this should be hard to do. > > Cheers, > Till > > On Wed, Sep 27, 2017 at 9:22 AM, Gyula Fóra <gyula.f...@gmail.com> wrote: > >&
