Re: Working with protobuf wrappers

2015-12-02 Thread Krzysztof Zarzycki
s.apache.org/jira/browse/FLINK-1635), we removed it >>>> again. Now you have to register the ProtobufSerializer manually: >>>> https://ci.apache.org/projects/flink/flink-docs-master/apis/best_practices.html#register-a-custom-serializer-for-your-flink-program >>>
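
For reference, the registration described on that best-practices page looks roughly like the sketch below. This is a minimal sketch, not code from the thread: MyProtoMessage stands in for any protoc-generated class, and ProtobufSerializer comes from Twitter's chill-protobuf, which has to be added as a dependency.

import com.twitter.chill.protobuf.ProtobufSerializer;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class RegisterProtobufSerializer {
    public static void main(String[] args) {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Tell Kryo to (de)serialize this generated protobuf class with chill's
        // ProtobufSerializer instead of its default FieldSerializer, which fails
        // on protobuf's immutable collections (the UnsupportedOperationException
        // reported in the thread below).
        env.getConfig().registerTypeWithKryoSerializer(MyProtoMessage.class, ProtobufSerializer.class);

        // ... build and execute the rest of the pipeline as usual
    }
}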

Working with protobuf wrappers

2015-11-30 Thread Krzysztof Zarzycki
Hi! I'm trying to use generated Protobuf wrappers compiled with protoc and pass them as objects between Flink functions. I'm using Flink 0.10.0. Unfortunately, I get an exception at runtime: [...] Caused by: com.esotericsoftware.kryo.KryoException: java.lang.UnsupportedOperationException

RocksDB state checkpointing is expensive?

2016-04-07 Thread Krzysztof Zarzycki
Hi, I looked at the documentation and source code of the RocksDB state management and, before I use it, I'm concerned about one thing: am I right that currently, when state is being checkpointed, the whole RocksDB state is snapshotted? There is no incremental, diff-based snapshotting, is there? If so, this seems

Re: RocksDB state checkpointing is expensive?

2016-04-07 Thread Krzysztof Zarzycki
checkpoint, we essentially copy the whole RocksDB > database to HDFS (or whatever filesystem you chose as a backup location). > As far as I know, Stephan will start working on adding support for > incremental snapshots this week or next week. > > Cheers, > Aljoscha > > On Thu,
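
For reference, the incremental snapshotting discussed here landed in later Flink releases as an option on the RocksDB backend. A minimal sketch, assuming a Flink version that has the incremental-checkpointing constructor; the HDFS URI is a placeholder:

import org.apache.flink.contrib.streaming.state.RocksDBStateBackend;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class IncrementalRocksDBCheckpoints {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // The second argument enables incremental checkpoints: only new/changed
        // RocksDB SST files are uploaded instead of the whole database.
        env.setStateBackend(new RocksDBStateBackend("hdfs:///flink/checkpoints", true));

        env.enableCheckpointing(60_000); // checkpoint every minute
        // ... rest of the job
    }
}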

Re: Does Kafka connector leverage Kafka message keys?

2016-05-12 Thread Krzysztof Zarzycki
If I can throw in my 2 cents, I agree with what Elias says. Without that feature (not re-partitioning already-partitioned Kafka data), Flink is in a bad position for common, simpler processing that doesn't involve shuffling at all, for example a simple readKafka-enrich-writeKafka pipeline. Systems like the new
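
The kind of job described here is a plain read-transform-write pipeline with no keyBy and hence no shuffle. A minimal sketch, with topic names and the enrichment step invented for illustration, and class names as in the later universal Kafka connector (in 2016 they were version-suffixed):

import java.util.Properties;
import org.apache.flink.api.common.serialization.SimpleStringSchema;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaConsumer;
import org.apache.flink.streaming.connectors.kafka.FlinkKafkaProducer;

public class ReadEnrichWrite {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        Properties props = new Properties();
        props.setProperty("bootstrap.servers", "localhost:9092");
        props.setProperty("group.id", "enricher");

        env.addSource(new FlinkKafkaConsumer<>("input-topic", new SimpleStringSchema(), props))
           // Simple per-record enrichment; no keyBy, so records stay in the
           // subtask that read them and no network shuffle happens.
           .map(value -> value + ",enriched")
           .addSink(new FlinkKafkaProducer<>("localhost:9092", "output-topic", new SimpleStringSchema()));

        env.execute("read-enrich-write");
    }
}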

multi-application correlated savepoints

2016-05-10 Thread Krzysztof Zarzycki
Hi! I'm thinking about using a great piece of Flink functionality: savepoints. I would like to be able to stop my streaming application, roll back its state, and restart it (for example to update code or fix a bug). Let's say I would like to travel back in time and reprocess some data. But what if I

Re: Join a datastream with tables stored in Hive

2019-12-13 Thread Krzysztof Zarzycki
support look up, >> like HBase. >> >> Both solutions are not ideal now, and we also aims to improve this maybe >> in the following >> release. >> >> Best, >> Kurt >> >> >> On Fri, Dec 13, 2019 at 1:44 AM Krzysztof Zarzycki >> w
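
A processing-time lookup join of the kind alluded to here (against a lookup-capable connector such as HBase) looks roughly as follows in Flink SQL. This is a minimal sketch with invented table and column names; support depends on the planner and connector versions:

import org.apache.flink.table.api.Table;
import org.apache.flink.table.api.TableEnvironment;

public class LookupJoinSketch {
    // Assumes `events` is registered as a streaming table with a processing-time
    // attribute `proc_time`, and `dim_hbase` is a table on a lookup-capable
    // connector (e.g. the HBase connector mentioned in the reply).
    public static Table enrich(TableEnvironment tEnv) {
        return tEnv.sqlQuery(
            "SELECT e.id, e.payload, d.attribute " +
            "FROM events AS e " +
            "JOIN dim_hbase FOR SYSTEM_TIME AS OF e.proc_time AS d " +
            "ON e.id = d.id");
    }
}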

Join a datastream with tables stored in Hive

2019-12-12 Thread Krzysztof Zarzycki
Hello dear Flinkers, if this kind of question has already been asked on the list, I'm sorry for the duplicate; feel free to just point me to the thread. I have to solve a probably pretty common case of joining a datastream to a dataset. Let's say I have the following setup: * I have a high-pace stream of

Re: Join a datastream with tables stored in Hive

2019-12-16 Thread Krzysztof Zarzycki
ing maybe in 1 or 2 releases. > > Best, > Kurt > > [1] > https://ci.apache.org/projects/flink/flink-docs-master/dev/table/streaming/query_configuration.html#idle-state-retention-time > > > On Sat, Dec 14, 2019 at 3:41 AM Krzysztof Zarzycki > wrote: > >> V
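
The idle-state retention setting from [1] is applied on the table configuration. A minimal sketch; the exact method and its location have moved between Flink versions:

import org.apache.flink.api.common.time.Time;
import org.apache.flink.table.api.TableConfig;

public class IdleStateRetentionExample {
    public static void configure(TableConfig config) {
        // State for keys that have been idle longer than the maximum retention
        // is dropped, bounding the otherwise ever-growing join state.
        config.setIdleStateRetentionTime(Time.hours(12), Time.hours(24));
    }
}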

Re: Dynamic Flink SQL

2020-04-07 Thread Krzysztof Zarzycki
ch is significantly better justifying the work by us, maybe also the community, on overcoming these difficulties. > > thanks, > > maciek > > > > On 27/03/2020 10:18, Krzysztof Zarzycki wrote: > > I want to do a bit different hacky PoC: > * I will write a sink, that cach

Re: Dynamic Flink SQL

2020-03-25 Thread Krzysztof Zarzycki
resulting code would be minimal > and easy to maintain. If the performance is not satisfying, you can always > make it more complicated. > > Best, > > Arvid > > > On Mon, Mar 23, 2020 at 7:02 PM Krzysztof Zarzycki > wrote: > >> Dear Flink community! >> >&

Re: Dynamic Flink SQL

2020-03-27 Thread Krzysztof Zarzycki
discussed above. [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream [2] https://flink.apache.org/news/2020/03/24/demo-fraud-detection-2.html > On Wed, Mar 25, 2020 at 6:15 PM Krzysztof Zarz
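
The experimental feature from [1] is used roughly as below. A minimal sketch: it assumes the input stream is already partitioned exactly as Flink's own keyBy would partition it (otherwise results are incorrect); the source and key selector are placeholders.

import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.datastream.DataStreamUtils;
import org.apache.flink.streaming.api.datastream.KeyedStream;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReinterpretAsKeyed {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Placeholder source standing in for a stream that is already partitioned by key.
        DataStream<String> prePartitioned = env.fromElements("a", "b", "c");

        // Declare the stream keyed without a network shuffle; keyed state and
        // timers become available downstream as if keyBy had been applied.
        KeyedStream<String, String> keyed = DataStreamUtils.reinterpretAsKeyedStream(
            prePartitioned,
            new KeySelector<String, String>() {
                @Override
                public String getKey(String value) {
                    return value;
                }
            });

        keyed.print();
        env.execute("reinterpret-as-keyed");
    }
}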

Complex graph-based sessionization (potential use for stateful functions)

2020-03-30 Thread Krzysztof Zarzycki
Hi! An interesting problem to solve ahead :) I need to implement a streaming sessionization algorithm (splitting a stream of events into groups of correlated events). It's pretty non-standard, as we DON'T have a key, like a user id, that separates the stream into substreams which we would just need to chunk based
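
For contrast, the "standard" sessionization that does not apply here (because it presumes a key such as a user id) is just keyed session windows. A minimal sketch with invented types and a toy gap:

import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.api.java.functions.KeySelector;
import org.apache.flink.api.java.tuple.Tuple2;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;
import org.apache.flink.streaming.api.windowing.assigners.EventTimeSessionWindows;
import org.apache.flink.streaming.api.windowing.time.Time;

public class KeyedSessionization {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        env.fromElements(Tuple2.of("user-1", 1_000L), Tuple2.of("user-1", 2_000L), Tuple2.of("user-2", 5_000L))
           .assignTimestampsAndWatermarks(
               WatermarkStrategy.<Tuple2<String, Long>>forMonotonousTimestamps()
                   .withTimestampAssigner((event, ts) -> event.f1))
           // This keyBy is exactly what is missing in the problem above: there is no user-id-like key.
           .keyBy(new KeySelector<Tuple2<String, Long>, String>() {
               @Override
               public String getKey(Tuple2<String, Long> event) {
                   return event.f0;
               }
           })
           // Close a session after 30 minutes of inactivity per key.
           .window(EventTimeSessionWindows.withGap(Time.minutes(30)))
           .sum(1)
           .print();

        env.execute("keyed-sessionization");
    }
}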

Dynamic Flink SQL

2020-03-23 Thread Krzysztof Zarzycki
Dear Flink community! In our company we have implemented a system that realizes the dynamic business rules pattern. We spoke about it during Flink Forward 2019: https://www.youtube.com/watch?v=CyrQ5B0exqU. The system is a great success and we would like to improve it. Let me briefly mention what
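
The dynamic-rules pattern mentioned here is commonly built on Flink's broadcast state: rules arrive on a broadcast stream at runtime and are applied to the keyed event stream. A minimal sketch, with the event/rule representation and the matching logic invented for illustration; it is not the system from the talk:

import java.util.Map;
import org.apache.flink.api.common.state.MapStateDescriptor;
import org.apache.flink.api.common.typeinfo.Types;
import org.apache.flink.streaming.api.datastream.BroadcastStream;
import org.apache.flink.streaming.api.datastream.DataStream;
import org.apache.flink.streaming.api.functions.co.KeyedBroadcastProcessFunction;
import org.apache.flink.util.Collector;

public class DynamicRulesSketch {

    static final MapStateDescriptor<String, String> RULES =
        new MapStateDescriptor<>("rules", Types.STRING, Types.STRING);

    /** Applies the currently broadcast rules to every keyed event. */
    static class ApplyRules extends KeyedBroadcastProcessFunction<String, String, String, String> {

        @Override
        public void processElement(String event, ReadOnlyContext ctx, Collector<String> out) throws Exception {
            // Evaluate the event against every rule currently held in broadcast state.
            for (Map.Entry<String, String> rule : ctx.getBroadcastState(RULES).immutableEntries()) {
                if (event.contains(rule.getValue())) {
                    out.collect("rule " + rule.getKey() + " matched " + event);
                }
            }
        }

        @Override
        public void processBroadcastElement(String rule, Context ctx, Collector<String> out) throws Exception {
            // A new or updated rule arrives at runtime; store it (here keyed by its own text).
            ctx.getBroadcastState(RULES).put(rule, rule);
        }
    }

    static DataStream<String> wire(DataStream<String> events, DataStream<String> rules) {
        BroadcastStream<String> broadcastRules = rules.broadcast(RULES);
        return events
            .keyBy(event -> event)   // placeholder key extraction
            .connect(broadcastRules)
            .process(new ApplyRules());
    }
}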

Multi-stream SQL-like processing

2020-11-02 Thread Krzysztof Zarzycki
Hi community, I would like to run one idea by you. I was thinking that Flink SQL could be Flink's answer to Kafka Connect (more powerful, with advantages like being decoupled from Kafka). Flink SQL would be the configuration language for Flink "connectors", which sounds great! But one thing
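
The "Flink SQL as the configuration language for connectors" idea can be illustrated with plain DDL plus INSERT INTO. A minimal sketch; connector options follow the Flink 1.11-era Kafka SQL connector, and topic/field names are invented:

import org.apache.flink.table.api.EnvironmentSettings;
import org.apache.flink.table.api.TableEnvironment;

public class SqlAsConnectorConfig {
    public static void main(String[] args) {
        TableEnvironment tEnv =
            TableEnvironment.create(EnvironmentSettings.newInstance().inStreamingMode().build());

        // Source and sink are both "configured" purely in SQL DDL.
        tEnv.executeSql(
            "CREATE TABLE input_topic (" +
            "  id STRING, payload STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'input'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'scan.startup.mode' = 'earliest-offset'," +
            "  'format' = 'json')");

        tEnv.executeSql(
            "CREATE TABLE output_topic (" +
            "  id STRING, payload STRING" +
            ") WITH (" +
            "  'connector' = 'kafka'," +
            "  'topic' = 'output'," +
            "  'properties.bootstrap.servers' = 'localhost:9092'," +
            "  'format' = 'json')");

        // The whole "connector pipeline" is then a single continuous INSERT INTO query.
        tEnv.executeSql("INSERT INTO output_topic SELECT id, payload FROM input_topic");
    }
}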

Re: Multi-stream SQL-like processing

2020-11-05 Thread Krzysztof Zarzycki
(a): > Yes. The dynamism might be a problem. > Does Kafka Connect support discovering new tables and synchronizing them > dynamically? > > Best, > Jark > > On Thu, 5 Nov 2020 at 04:39, Krzysztof Zarzycki > wrote: > >> Hi Jark, thanks for joining the discussion