Re: Job gets stuck when using kafka transactions and eventually crashes

2023-01-25 Thread Vishal Surana
link-docs-master/docs/connectors/datastream/kafka/#idleness > for more details on this. > > Best regards, > > Martijn > > > Op ma 23 jan. 2023 om 10:53 schreef Vishal Surana : > > > Could it be that link is unable to commit offsets to Kafka? I know that > > >

Re: Job gets stuck when using kafka transactions and eventually crashes

2023-01-23 Thread Vishal Surana
. But with Kafka transactions enabled, the commit of offset is now required to happen. Thanks, Vishal On 23 Jan 2023 at 12:18 PM +0530, Vishal Surana , wrote: > My job runs fine when running without kafka transactions. The source and sink > are kafka in my job with a couple of RocksDB based stateful ope

Job gets stuck when using kafka transactions and eventually crashes

2023-01-22 Thread Vishal Surana
My job runs fine when running without kafka transactions. The source and sink are kafka in my job with a couple of RocksDB based stateful operators taking 100GB each. When I enable kafka transactions, things go well initially and we can see high throughput as well. However, after a few hours, the

Re: Kafka transactioins & flink checkpoints

2022-11-16 Thread Vishal Surana
; Reducing the checkpoint interval is not really an option given the size of > > the checkpoint > > Do you use RocksDB state backend with incremental checkpointing? > > > On Tue, Nov 15, 2022 at 12:07 AM Vishal Surana wrote: > > > I wanted to achieve exactly on

Kafka transactioins & flink checkpoints

2022-11-15 Thread Vishal Surana
I wanted to achieve exactly once semantics in my job and wanted to make sure I understood the current behaviour correctly: 1. Only one Kafka transaction at a time (no concurrent checkpoints) 2. Only one transaction per checkpoint My job has very large amount of state (>100GB) and I have no

When should we use flink-json instead of Jackson directly?

2022-10-28 Thread Vishal Surana
I've been using Jackson to deserialize JSON messages into Scala classes and Java POJOs. The object mapper is heavily customized for our use cases. It seems that flink-json internally uses Jackson as well and allows for injecting our own mappers. Would there be any benefit of using flink-json

Re: ContinuousFileMonitoringFunction retrieved invalid state.

2022-07-04 Thread Vishal Surana
, it seems >> try to restore from an invalid state. Seems the state actually contains >> more that one value, but Flink expected the state should contains one or >> zero value. >> >> Best regards, >> Yuxia >> >> -- >

Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Vishal Surana
In my load tests, I've found FIFO compaction to offer the best performance as my job needs state only for so long. However, this particular statement in RocksDB documentation concerns me: "Since we never rewrite the key-value pair, we also don't ever apply the compaction filter on the keys."

ContinuousFileMonitoringFunction retrieved invalid state.

2022-06-30 Thread Vishal Surana
My job is unable to restore state after savepoint due to the following exception. Seems to be a rare exception as I haven't found any forum discussing it. Please advise. java.lang.IllegalArgumentException: ContinuousFileMonitoringFunction retrieved invalid state. at

Re: Optimizing parallelism in reactive mode with adaptive scaling

2022-06-29 Thread Vishal Surana
. > > > [1]https://flink.apache.org/2021/05/06/reactive-mode.html > > Best, > Weihua > > > On Wed, Jun 29, 2022 at 8:56 PM Vishal Surana wrote: > >> I have a job which has about 10 operators, 3 of which are heavy weight. I >> understand that the curre

Optimizing parallelism in reactive mode with adaptive scaling

2022-06-29 Thread Vishal Surana
I have a job which has about 10 operators, 3 of which are heavy weight. I understand that the current implementation of autoscaling gives more or less no configurability besides max parallelism. That is practically useless as the operators I have will inevitably choke if one of the 3 ends up with

Re: Broadcast State + Stateful Operator + Async IO

2022-04-29 Thread Vishal Surana
Thanks a lot for your quick response! Your suggestion however would never work for our use case. Ours is a streaming system that must process 100 thousand messages per second and produce immediate results and it's simply impossible to rerun the job. Our job is a streaming job broken down into

Re: Broadcast State + Stateful Operator + Async IO

2022-04-29 Thread Vishal Surana
ally impossible. > As for whether there are other solutions, it may depend on specific > scenarios, such as what kind of external system. So could you describe in > detail what scenario has this requirement, and what are the external > systems it depends on? > > Best, > Guowe

Broadcast State + Stateful Operator + Async IO

2022-04-28 Thread Vishal Surana
Hello, My application has a stateful operator which leverages RocksDB to store a large amount of state. It, along with other operators receive configuration as a broadcast stream (KeyedBroadcastProcessFunction). The operator depends upon another input stream that triggers some communication with

Disabling autogenerated uid/hash doesn't work when using file source

2021-08-25 Thread Vishal Surana
I set names and uid for all my flink operators and have explicitly disabled auto generation of uid to force developers in my team the same practice. However, when using a file source, there's no option of providing it due to which the job fails to start unless we enable auto generation. Am I

Protobuf + Confluent Schema Registry support

2021-06-30 Thread Vishal Surana
Using the vanilla kafka producer, I can write protobuf messages to kafka while leveraging schema registry support as well. A flink kafka producer requires us to explicity provide a serializer which converts the message to a producerrecord containing the serialized bytes of the message. We can't