er case, you probably want to annotate them with
> Forward Field annotations.
>
> The number of source tasks is unrelated to the number of splits. If you
> have more tasks than splits, some tasks won't process any data.
>
> Best, Fabian
>
> [1]
> https://ci.apache.org/projects/fli
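To illustrate Fabian's point about splits vs. source tasks: the number of splits comes from the input format's parameters provider, not from the source's parallelism. Below is a plain-Java sketch (no Flink dependency; names are illustrative, not Flink's actual code) of how a numeric key range could be partitioned into per-split query parameters, similar in spirit to Flink's NumericBetweenParametersProvider:

```java
// Plain-Java sketch of partitioning a numeric range into per-split
// "BETWEEN ? AND ?" parameters. One inner array per split; a source
// parallelism higher than the split count leaves subtasks without data.
public class SplitSketch {
    static long[][] splits(long min, long max, long fetchSize) {
        int n = (int) Math.ceil((double) (max - min + 1) / fetchSize);
        long[][] params = new long[n][2];
        long start = min;
        for (int i = 0; i < n; i++) {
            long end = Math.min(start + fetchSize - 1, max);
            params[i] = new long[] {start, end}; // bound to WHERE id BETWEEN ? AND ?
            start = end + 1;
        }
        return params;
    }

    public static void main(String[] args) {
        for (long[] p : splits(1, 100, 30)) {
            System.out.println(p[0] + " .. " + p[1]);
        }
    }
}
```

With a range of 1..100 and a fetch size of 30 this yields 4 splits, so a source parallelism of 8 would leave 4 subtasks idle, which matches the behavior described above.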
Hi everyone,
I have the following scenario: I have a database table with 3 columns: a
host (string), a timestamp, and an integer ID. Conceptually, what I'd like
to do is:
group by host and timestamp -> based on all the IDs in each group, create a
mapping to n new tuples -> for each unique tuple,
n
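The transformation described above can be sketched in plain Java, independent of Flink operators (the actual mapping from IDs to new tuples is job-specific, so the one below is only a placeholder). In the DataSet API this corresponds roughly to groupBy(...).reduceGroup(...) followed by distinct():

```java
import java.util.*;
import java.util.stream.*;

// Plain-Java sketch: group rows by (host, timestamp), derive tuples from
// the IDs of each group, keep only distinct tuples. The per-group mapping
// here (one (host, id) tuple per ID) is a stand-in for the real logic.
public class GroupSketch {
    record Row(String host, long ts, int id) {}

    static Set<List<Object>> derive(List<Row> rows) {
        // group by (host, timestamp), collecting the IDs of each group
        Map<List<Object>, List<Integer>> groups = rows.stream().collect(
                Collectors.groupingBy(r -> List.<Object>of(r.host(), r.ts()),
                        Collectors.mapping(Row::id, Collectors.toList())));
        // derive tuples per group, then deduplicate via the Set
        return groups.entrySet().stream()
                .flatMap(e -> e.getValue().stream()
                        .map(id -> List.<Object>of(e.getKey().get(0), id)))
                .collect(Collectors.toSet());
    }
}
```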
> (ExecutionEnvironment.getExecutionPlan()) using the plan visualizer [1] and
> validate that the results are identical whether you use SDP or not.
>
> Best, Fabian
>
> [1] https://flink.apache.org/visualizer/
>
> 2018-08-07 22:32 GMT+02:00 Alexis Sarda :
>
>> H
, even though a groupBy
was used.
- The third and final subtask is a sink that saves back to the database.
Does anyone know why parallelism is not being used?
Regards,
Alexis.
On Thu, Aug 9, 2018 at 11:22 AM Alexis Sarda wrote:
> Hi Fabian,
>
> Thanks a lot for the help. The scala Da
hare the plan for the program?
>
> Are you sure that more than 1 split is generated by the JdbcInputFormat?
>
> 2018-08-10 12:04 GMT+02:00 Alexis Sarda :
>
>> It seems I may have spoken too soon. After executing the job with more
>> data, I can see the following things i
Hello everyone,
I just experienced something weird and I'd like to know if anyone has any idea
of what could have happened.
I have a simple Flink cluster version 1.11.3 running on Kubernetes with a
single worker.
I was testing a pipeline that connects 2 keyed streams and processes the result
Hello,
I am currently testing a scenario where I would run the same job multiple times
in a loop with different inputs each time. I am testing with a local Flink
cluster v1.12.4. I initially got an OOM - Metaspace error, so I increased the
corresponding memory in the TM's JVM (to 512m), but it
cleaned up? In this
scenario only my jobs would be running in the cluster, so I can have a bit more
control.
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, July 8, 2021 12:14 AM
To: user@flink.apache.org
Subject: OOM Metaspace after multiple jobs
Alexis.
From: Roman Khachatryan
Sent: Friday, July 2, 2021 9:19 PM
To: Alexis Sarda-Espinosa ; Yang Wang
Cc: user@flink.apache.org
Subject: Re: Using Flink's Kubernetes API inside Java
Hi Alexis,
Have you looked at flink-on-k8s-operator [1]?
It seems t
Thanks Roman and Yang, I understand. I’ll have a look and ask on the developer
list depending on what I find.
Regards,
Alexis.
From: Yang Wang
Sent: Wednesday, July 7, 2021 05:14
To: ro...@apache.org
Cc: Alexis Sarda-Espinosa ;
user@flink.apache.org
Subject: Re: Using Flink's Kubernetes API
Hello everyone,
I'm testing a custom Kubernetes operator that should fulfill some specific
requirements I have for Flink. I know of this WIP project:
https://github.com/wangyang0918/flink-native-k8s-operator
I can see that it uses some classes that aren't publicly documented, and I
believe it
, March 12, 2021 6:22 PM
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: [Schema Evolution] Cannot restore from savepoint after deleting
field from POJO
Hi Alexis,
This looks like a bug, I've created a Jira ticket to address it [1].
Please feel free to provide any additional
Hello,
Regarding the new BATCH mode of the data stream API, I see that the
documentation states that some operators will process all data for a given key
before moving on to the next one. However, I don't see how Flink is supposed to
know whether the input will provide all data for a given key
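For context: BATCH mode only applies to bounded sources, which is how the runtime can know it will eventually see all data for a key; with bounded input it may sort records by key so each key is processed contiguously. A plain-Java sketch of that idea (illustrative only, not Flink's actual implementation):

```java
import java.util.*;

// Plain-Java sketch of why BATCH mode can finish one key before the next:
// bounded input can be sorted by key, so a keyed operator sees all records
// of key A, then all of key B, and can emit/clear state at each key
// boundary instead of waiting for watermarks.
public class KeyedBatchSketch {
    public static List<String> processSorted(List<Map.Entry<String, Integer>> input) {
        input.sort(Map.Entry.comparingByKey()); // what the batch runtime does
        List<String> out = new ArrayList<>();
        String current = null;
        int sum = 0;
        for (Map.Entry<String, Integer> e : input) {
            if (current != null && !current.equals(e.getKey())) {
                out.add(current + "=" + sum); // key finished: emit, clear state
                sum = 0;
            }
            current = e.getKey();
            sum += e.getValue();
        }
        if (current != null) out.add(current + "=" + sum);
        return out;
    }
}
```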
Hi everyone,
It seems I'm having either the same problem, or a problem similar to the one
mentioned here:
http://apache-flink-user-mailing-list-archive.2336050.n4.nabble.com/Problem-when-restoring-from-savepoint-with-missing-state-amp-POJO-modification-td39945.html
I have a POJO class that is
: Dawid Wysakowicz
Sent: Friday, March 12, 2021 4:10 PM
To: Alexis Sarda-Espinosa; user@flink.apache.org
Subject: Re: DataStream in batch mode - handling (un)ordered bounded data
Hi Alexis,
As of now there is no such feature in the DataStream API. The Batch mode in
DataStream API is a new feature
Hello,
I have a Flink TM configured with taskmanager.memory.managed.size: 1372m. There
is a streaming job using RocksDB for checkpoints, so I assume some of this
memory will indeed be used.
I was looking at the metrics exposed through the REST interface, and I queried
some of them:
:30
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Clarification about Flink's managed memory and metric monitoring
Hi Alexis,
First of all, I strongly recommend not to look into the JVM metrics. These
metrics are fetched directly from JVM and do not well correspond to Flink's
I think it would be nice if the task manager pods get their values from the
configuration file only if the pod templates don’t specify any resources. That
was the goal of supporting pod templates, right? Allowing more custom scenarios
without letting the configuration options get bloated.
Hello,
I am trying to configure TLS communication for a Flink cluster running on
Kubernetes. I am currently using the BCFKS format and setting that as default
via javax.net.ssl.keystoretype and javax.net.ssl.truststoretype (which are
injected in the environment variable FLINK_ENV_JAVA_OPTS).
; Alexis Sarda-Espinosa
; matth...@ververica.com;
user@flink.apache.org
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi Yang,
I agree with you, but I think the limit-factor should be greater than or equal
to 1, and default to 1 is a better
I'm fairly certain you need the curly braces surrounding the variable, the
substitution is not done by the shell, it's just similar syntax (as mentioned
in the doc
http://logback.qos.ch/manual/configuration.html#variableSubstitution).
Chapter 3: Logback configuration -
. September 2021 08:09
To: Alexis Sarda-Espinosa
Cc: spoon_lz ; Denis Cosmin NUTIU ;
matth...@ververica.com; user@flink.apache.org
Subject: Re: Deploying Flink on Kubernetes with fractional CPU and different
limits and requests
Hi Alexis
Thanks for your valuable inputs.
First, I want to share
I'm not very knowledgeable when it comes to Linux memory management, but do
note that Linux (and by extension Kubernetes) takes disk IO into account when
deciding whether a process is using more memory than it's allowed to, see e.g.
Someone please correct me if I’m wrong but, until FLINK-16686 [1] is fixed, a
class must be a POJO to be used in managed state with RocksDB, right? That’s
not to say that the approach with TypeInfoFactory won’t work, just that even
then it will mean none of the data classes can be used for
Hi everyone,
I had been using the DataSet API until now, but since that's been deprecated, I
started looking into the Table API. In my DataSet job I have a lot of POJOs,
all of which are even annotated with @TypeInfo and provide the corresponding
factories. The Table API documentation talks
,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Wednesday, October 20, 2021 09:43
To: Parag Somani ; Caizhi Weng
Cc: Flink ML
Subject: RE: Troubleshooting checkpoint timeout
Currently the windows are 10 minutes in size with a 1-minute slide time. The
approximate 500 event/minute throughput is already
Hello everyone,
I am doing performance tests for one of our streaming applications and, after
increasing the throughput a bit (~500 events per minute), it has started
failing because checkpoints cannot be completed within 10 minutes. The Flink
cluster is not exactly under my control and is
(keySelector)
.connect(DataStreamUtils.reinterpretAsKeyedStream(openedEventsTimestamped,
keySelector))
.process(...)
Could this lead to delays or alignment issues?
Regards,
Alexis.
From: Parag Somani
Sent: Wednesday, October 20, 2021 09:22
To: Caizhi Weng
Cc: Alexis Sarda-Espinosa
,
Alexis.
From: Yang Wang
Sent: Friday, October 8, 2021 5:24 AM
To: Alexis Sarda-Espinosa
Cc: Flink ML
Subject: Re: Kubernetes HA - Reusing storage dir for different clusters
When the Flink job reaches a global terminal state (FAILED, CANCELED,
FINISHED), all the HA
Hello,
If I deploy a Flink-Kubernetes application with HA, I need to set
high-availability.storageDir. If my application is a batch job that may run
multiple times with the same configuration, do I need to manually clean up the
storage dir between each execution?
Regards,
Alexis.
eam operator has lower parallelism?
Regards,
Alexis.
From: Piotr Nowojski
Sent: Monday, October 25, 2021 09:59
To: Alexis Sarda-Espinosa
Cc: Parag Somani ; Caizhi Weng ;
Flink ML
Subject: Re: Troubleshooting checkpoint timeout
Hi Alexis,
You can read about those metrics in the documentation
Hi everyone,
Based on what I know, a single operator with parallelism > 1 checks the
watermarks from all its streams and uses the smallest one out of the non-idle
streams. My first question is whether watermarks are forwarded as long as a
different watermark strategy is not applied downstream?
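The combination rule described above can be sketched in plain Java (illustrative only, not Flink's internal code):

```java
// Plain-Java sketch of how an operator combines input watermarks: the
// operator's watermark is the minimum over all non-idle input channels,
// and it is simply forwarded downstream unless a new WatermarkStrategy
// (or timestamp assignment) is applied later in the pipeline.
public class WatermarkCombiner {
    static long combine(long[] watermarks, boolean[] idle) {
        long min = Long.MAX_VALUE;
        boolean anyActive = false;
        for (int i = 0; i < watermarks.length; i++) {
            if (!idle[i]) {              // idle channels are ignored
                anyActive = true;
                min = Math.min(min, watermarks[i]);
            }
        }
        return anyActive ? min : Long.MIN_VALUE;
    }
}
```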
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, August 5, 2021 15:49
To: user@flink.apache.org
Subject: Using POJOs with the table API
Hi everyone,
I had been using the DataSet API until now, but since that's been deprecated, I
started looking into the Table API. In my DataSet job I
checkpoints. Thanks again for all the info.
Regards,
Alexis.
From: Piotr Nowojski
Sent: Monday, October 25, 2021 15:51
To: Alexis Sarda-Espinosa
Cc: Parag Somani ; Caizhi Weng ;
Flink ML
Subject: Re: Troubleshooting checkpoint timeout
Hi Alexis,
> Should I understand these metrics as a prope
be that the new checkpoint barriers created after the first restart are
behind more data than before it restarted, no?
Regards,
Alexis.
From: Piotr Nowojski
Sent: Monday, October 25, 2021 13:35
To: Alexis Sarda-Espinosa
Cc: Parag Somani ; Caizhi Weng ;
Flink ML
Subject: Re: Troubleshooting checkpoint
://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/operators/windows/#consecutive-windowed-operations
Regards,
Alexis.
From: David Morávek
Sent: Thursday, December 2, 2021 17:26
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Watermark behavior when connecting streams
. December 2021 17:18
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Buffering when connecting streams
Even with the TwoInputStreamOperator you cannot "halt" the processing. You
need to buffer these elements, for example in ListState, for later
processing. At th
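A plain-Java sketch of the buffering pattern being suggested (ListState replaced by an in-memory list, and the "safe to process" condition reduced to a flag, purely for illustration):

```java
import java.util.*;

// Plain-Java sketch of buffering in a two-input operator: one input cannot
// be blocked, so elements that arrive "too early" on input 1 are parked in
// a buffer (ListState in Flink) and drained once the other input (or a
// watermark) signals it is safe to process them.
public class BufferingSketch {
    private final List<String> buffer = new ArrayList<>(); // stand-in for ListState<String>
    private boolean ready = false;                         // e.g. watermark passed / input 2 seen
    private final List<String> emitted = new ArrayList<>();

    void processElement1(String value) {
        if (ready) emitted.add(value);
        else buffer.add(value);        // cannot halt the input: park the element
    }

    void processElement2(String control) {
        ready = true;
        emitted.addAll(buffer);        // drain parked elements in arrival order
        buffer.clear();
    }

    List<String> output() { return emitted; }
}
```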
ávek
Sent: Thursday, December 2, 2021 16:59
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Buffering when connecting streams
I think this would require using lower level API and implementing a custom
`TwoInputStreamOperator`. Then you can hook to `processWatemark{1,2}` metho
reaches
processElement1, even when considering watermarks.
Regards,
Alexis.
From: David Morávek
Sent: Thursday, December 2, 2021 15:45
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Buffering when connecting streams
You cannot rely on the order of the two streams that easily
Hello,
I have a use case with event-time processing that ends up with a DAG roughly
like this:
source -> filter -> keyBy -> watermark -> window -> process -> keyBy_2 ->
connect (KeyedCoProcessFunction) -> sink
| /
n
second 17, and my windows are evaluated every minute, so it wasn’t a race
condition.
Regards,
Alexis.
From: David Morávek
Sent: Thursday, December 2, 2021 14:52
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Buffering when connecting streams
Hi Alexis,
I'm not sure w
per job, it seems) of some of my job's classes
(e.g. sources), and their GC roots were the Flink User Class Loader. I haven't
figured out why they would remain across different jobs.
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, July 8, 2021 12:51
For completeness, this still happens with Flink 1.14.4
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Friday, March 11, 2022 12:21 AM
To: user@flink.apache.org
Cc: pnowoj...@apache.org
Subject: Re: Interval join operator is not forwarding watermarks
Hello,
I'm in the process of updating from Flink 1.11.3 to 1.14.3, and it seems the
interval join in my pipeline is no longer working. More specifically, I have a
sliding window after the interval join, and the window isn't firing. After many
tests, I ended up creating a custom operator that
I found [1] and [2], which are closed, but could be related?
[1] https://issues.apache.org/jira/browse/FLINK-23698
[2] https://issues.apache.org/jira/browse/FLINK-18934
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, March 10, 2022 19:27
To: user@flink.apache.org
Subject
] https://github.com/asardaes/flink-interval-join-test
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, March 10, 2022 7:47 PM
To: user@flink.apache.org
Cc: pnowoj...@apache.org
Subject: RE: Interval join operator is not forwarding watermarks
Hi everyone,
I have some questions regarding max parallelism and how it interacts with
deployment modes. The documentation states that max parallelism should be "set
on a per-job and per-operator granularity" but doesn't provide more details. Is
it possible to have different values of max
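For reference, max parallelism is effectively the number of key groups: keys hash into key groups, and key groups are mapped onto the operator's subtasks, which is why the setting exists per job and per operator. A plain-Java sketch of the documented assignment (see Flink's KeyGroupRangeAssignment; the real code murmur-hashes the key hash first, which is omitted here):

```java
// Plain-Java sketch of Flink's key-group assignment. maxParallelism fixes
// the number of key groups; each key group is mapped to one of the
// `parallelism` subtasks, so state can be redistributed when parallelism
// changes, as long as maxParallelism stays the same for that operator.
public class KeyGroupSketch {
    static int keyGroup(int keyHash, int maxParallelism) {
        return Math.floorMod(keyHash, maxParallelism); // Flink murmur-hashes first
    }

    static int operatorIndex(int keyGroup, int maxParallelism, int parallelism) {
        return keyGroup * parallelism / maxParallelism;
    }
}
```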
handling multiple streams that need joins and so on.
What do you think?
Regards,
Alexis.
From: Robert Metzger
Sent: Friday, January 28, 2022 14:49
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Determinism of interval joins
Instead of using `reinterpretAsKeyedStream` can you
{
    if (mark.getTimestamp() > maxTimestamp2) {
        maxTimestamp2 = mark.getTimestamp();
    }
    maybeProcessWatermark2(mark, maxTimestamp1, maxTimestamp1);
}
}
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Saturday, January 29, 2022 13:47
To: Robert Metzger
Cc: user@
Hello,
Happened to me too, here’s the JIRA ticket:
https://issues.apache.org/jira/browse/FLINK-21752
Regards,
Alexis.
From: bastien dine
Sent: Wednesday, February 2, 2022 16:01
To: user
Subject: Pojo State Migration - NPE with field deletion
Hello,
I have some trouble restoring a state
rks should be deterministic,
the input file is sorted, and the watermark strategies should essentially
behave like the monotonous generator.
[1] https://issues.apache.org/jira/browse/FLINK-24466
Regards,
Alexis.
From: Alexis Sarda-Espinosa
Sent: Thursday, January 27, 20
Hi everyone,
I'm seeing a lack of determinism in unit tests when using an interval join. I
am testing with both Flink 1.11.3 and 1.14.3 and the relevant bits of my
pipeline look a bit like this:
keySelector1 = ...
keySelector2 = ...
rightStream = stream1
.flatMap(...)
.keyBy(keySelector1)
in a (Process)JoinFunction? The join needs
keys, but I don’t know if the resulting stream counts as keyed from the state’s
point of view.
Regards,
Alexis.
From: Piotr Nowojski
Sent: Monday, December 6, 2021 08:43
To: David Morávek
Cc: Alexis Sarda-Espinosa ;
user@flink.apache.org
Subject: Re
mean an operator is a bad
idea, it's just something that other users might want to keep in mind.
Regards,
Alexis.
From: Robert Metzger
Sent: Thursday, January 20, 2022 7:06 PM
To: Alexis Sarda-Espinosa
Cc: dev ; user
Subject: Re: Flink native k8s integration vs
in the state as well.
Regards,
Alexis.
-Original Message-
From: Roman Khachatryan
Sent: Tuesday, April 12, 2022 14:06
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: RocksDB's state size discrepancy with what's seen with state
processor API
/shared folder contains ke
rom: Roman Khachatryan
Sent: Tuesday, April 12, 2022 12:37
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: RocksDB's state size discrepancy with what's seen with state
processor API
Hi Alexis,
Thanks a lot for sharing this. I think the program is correct.
Although it doesn't t
oman Khachatryan <ro...@apache.org>
Sent: Friday, April 8, 2022 11:06 PM
To: Alexis Sarda-Espinosa <alexis.sarda-espin...@microfocus.com>
Cc: user@flink.apache.org
Subject: Re: RocksDB's state size
h
parallelism=4, this doesn't affect the result, does it?
Regards,
Alexis.
From: Roman Khachatryan
Sent: Friday, April 8, 2022 11:06 PM
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: RocksDB's state size discrepancy with what's seen with state
processo
Message-
From: Roman Khachatryan
Sent: Friday, April 8, 2022 11:48
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: Using state processor API to read state defined with a TypeHint
Hi Alexis,
I think your setup is fine, but probably Java type erasure makes Flink consider
Hi everyone,
I have a ProcessWindowFunction that uses Global window state. It uses MapState
with a descriptor defined like this:
MapStateDescriptor<Long, ...> msd = new MapStateDescriptor<>(
        "descriptorName",
        TypeInformation.of(Long.class),
        TypeInformation.of(new TypeHint<...>() {})
File State; that accounts for almost 1.5GB.
I believe that is one of the files RocksDB uses internally, but is that related
to managed state used by my functions? Or does that indicate size growth is
elsewhere?
Regards,
Alexis.
-Original Message-
From: Alexis Sarda-Espinosa
Sent
Hello,
According to the javadoc of TriggerResult.PURGE, "All elements in the
window are cleared and the window is discarded, without evaluating
the window function or emitting any elements."
However, I've noticed that using a GlobalWindow (with a custom trigger)
followed by an AggregateFunction
Hello,
Just a shot in the dark here, but could it be related to
https://issues.apache.org/jira/browse/FLINK-32241 ?
Such failures can cause many exceptions, but I think the ones you've
included aren't pointing to the root cause, so I'm not sure if that issue
applies to you.
Regards,
Alexis.
On
>
> PS: I’m currently working on this ticket in order to get some glitches
> removed: FLINK-26585 <https://issues.apache.org/jira/browse/FLINK-26585>
>
>
>
>
>
> *From:* Alexis Sarda-Espinosa
> *Sent:* Thursday, October 26, 2023 4:01 PM
> *To:* user
&
Hello,
The documentation of the state processor API has some examples to modify an
existing savepoint by defining a StateBootstrapTransformation. In all
cases, the entrypoint is OperatorTransformation#bootstrapWith, which
expects a DataStream. If I pass an empty DataStream to bootstrapWith and
Hello,
very quick question, the documentation for side outputs states that an
OutputTag "needs to be an anonymous inner class, so that we can analyze the
type" (this is written in a comment in the example). Is this really true?
I've seen many examples where it's a static element and it seems to
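The `{}` matters because an anonymous subclass preserves its generic type argument at runtime (the "super type token" trick), which is what the type analysis relies on; a static OutputTag can still work when the type is supplied another way, e.g. via an explicit TypeInformation constructor argument. A plain-Java demo of the mechanism, independent of Flink:

```java
import java.lang.reflect.ParameterizedType;
import java.lang.reflect.Type;
import java.util.List;

// Plain-Java demo of the "super type token" pattern behind
// new OutputTag<String>("side") {}: the trailing {} creates a subclass
// whose generic superclass keeps the type argument, so it can be
// recovered reflectively despite erasure.
public class TypeTokenDemo {
    static class Token<T> {
        Type captured() {
            // Only works for subclasses, e.g. anonymous ones created with {}.
            return ((ParameterizedType) getClass().getGenericSuperclass())
                    .getActualTypeArguments()[0];
        }
    }

    public static void main(String[] args) {
        Token<List<String>> token = new Token<>() {}; // note the {}
        System.out.println(token.captured());         // type argument survives erasure
    }
}
```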
behavior. A ticket and PR could be
> added to fix the document. What do you think?
>
> Best,
> Yunfeng
>
> On Fri, Sep 22, 2023 at 4:55 PM Alexis Sarda-Espinosa
> wrote:
> >
> > Hello,
> >
> > very quick question, the documentation for side outputs state
Singh Lilhore <
surendralilh...@gmail.com>:
> Hi Alexis,
>
> Could you please check the TaskManager log for any exceptions?
>
> Thanks
> Surendra
>
>
> On Thu, Sep 28, 2023 at 7:06 AM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> He
> Ram
>
> On Thu, Sep 28, 2023 at 2:06 AM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>>
>> We are using ABFSS for RocksDB's backend as well as the storage dir
>> required for Kubernetes HA. In the Azure Portal's monitoring
dependent from each
> other and OutputTag's document is correct from this aspect.
>
> [1]
> https://github.com/apache/flink/blob/master/flink-core/src/main/java/org/apache/flink/util/OutputTag.java#L82
>
> Best,
> Yunfeng
>
> On Mon, Sep 25, 2023 at 10:57 PM Alexis Sarda-Espi
Hello,
We are using ABFSS for RocksDB's backend as well as the storage dir
required for Kubernetes HA. In the Azure Portal's monitoring insights I see
that every single operation contains failing transactions for the
GetPathStatus API. Unfortunately I don't see any additional details, but I
know
job, is it able to
> safely use the checkpoint and get back to the checkpointed state?
>
> Regards
> Ram,
>
> On Thu, Sep 28, 2023 at 4:46 PM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hi Surendra,
>>
>> there are no exceptions in the logs,
r classes this but unfortunately I have no
> idea how to use these classes or how they might be able to help me. This is
> all very new to me and I honestly can't wrap my head around Flink's type
> information system.
>
> Best regards,
> Saleh.
>
> On 14 Aug 2023, at 4:05 PM, Alex
b run and wait? I've been reading through RocksDB documentation as well,
but that might not be enough because I don't know how Flink handles its own
framework state internally.
Regards,
Alexis.
From: David Anderson
Sent: Friday, April 22, 2022 9:57 AM
To: Alexi
at 10:52 PM Alexis Sarda-Espinosa
wrote:
>
> I can look into RocksDB metrics, I need to configure Prometheus at some point
> anyway. However, going back to the original question, is there no way to gain
> more insight into this with the state processor API? You've mentioned
> p
ructors, i.e. before they are serialized to the TM, but the descriptors
are Serializable, so I imagine that's not relevant.
[1] https://stackoverflow.com/a/50510054/5793905
Regards,
Alexis.
-Original Message-
From: Roman Khachatryan
Sent: Tuesday, April 19, 2022 11:48
To: Alexis Sard
ds,
Alexis.
From: Roman Khachatryan
Sent: Tuesday, April 19, 2022 5:51 PM
To: Alexis Sarda-Espinosa
Cc: user@flink.apache.org
Subject: Re: RocksDB's state size discrepancy with what's seen with state
processor API
> I assume that when you say "new states",
May I ask if anyone tested RocksDB’s periodic compaction in the meantime? And
if yes, if it helped with this case.
Regards,
Alexis.
From: tao xiao
Sent: Saturday, September 18, 2021 05:01
To: David Morávek
Cc: Yun Tang ; user
Subject: Re: RocksDB state not cleaned up
Thanks for the feedback!
Hello,
I have a streaming job running on Flink 1.14.4 that uses managed state with
RocksDB with incremental checkpoints as backend. I've been monitoring a dev
environment that has been running for the last week and I noticed that state
size and end-to-end duration have been increasing
Hi Dawid,
Thanks for the update, I also managed to work around it by adding another
watermark assignment operator between the join and the window. I’ll have to see
if it’s possible to assign watermarks at the source, but even if it is, I worry
that the different partitions created by all my
. Seems like normal operations, so it's just unfortunate the
Azure API exposes that as continuous ClientOtherError metrics.
Regards,
Alexis.
On Fri, Oct 6, 2023 at 08:10, Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:
> Yes, that also works correctly, at least based on
Hi David,
Please refer to https://issues.apache.org/jira/browse/FLINK-21752
Regards,
Alexis.
-Original Message-
From: David Jost
Sent: Wednesday, May 18, 2022 15:07
To: user@flink.apache.org
Subject: Schema Evolution of POJOs fails on Field Removal
Hi,
we currently have an issue,
\
--command-config kafka-preprod.properties \
--reset-offsets --to-datetime '2022-07-20T00:01:00.000' \
--execute
Regards,
Alexis.
On Thu, Jul 14, 2022 at 22:56, Alexis Sarda-Espinosa <
sarda.espin...@gmail.com> wrote:
> Hi Yaroslav,
>
> The test I did was just using earl
n the job is initially started
> from a clear slate.
> Once checkpoints are involved it only relies on offsets stored in the
> state.
>
> On 20/07/2022 14:51, Alexis Sarda-Espinosa wrote:
>
> Hello again,
>
> I just performed a test
> using OffsetsInitializer.committedOffse
Hello,
Regarding the new Kafka source (configure with a consumer group), I found
out that if I manually change the group's offset with Kafka's admin API
independently of Flink (while the job is running), the Flink source will
ignore that and reset it to whatever it stored internally. Is there any
Hello,
I have a job running with Flink 1.15.0 that consumes from Kafka with the
new KafkaSource API, setting a group ID explicitly and specifying
OffsetsInitializer.earliest() as a starting offset. Today I restarted the
job ignoring both savepoint and checkpoint, and the consumer started
reading
In this case, it should get the offsets from Kafka and
> not the state.
>
> On Thu, Jul 14, 2022 at 11:18 AM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>>
>> Regarding the new Kafka source (configure with a consumer group), I found
tartFromEarliest()** / **setStartFromLatest()**: Start from the
>> earliest / latest record. Under these modes, committed offsets in Kafka
>> will be ignored and not used as starting positions.*
>>
>> On 13/07/2022 18:53, Alexis Sarda-Espinosa wrote:
>>
>> Hello,
Ok
Regards,
Alexis.
From: Peter Brucia
Sent: Friday, April 22, 2022 15:31
To: Alexis Sarda-Espinosa
Subject: Re: RocksDB's state size discrepancy with what's seen with state
processor API
No
Sent from my iPhone
Hi everyone,
I know the low level details of this are likely internal, but at a high
level we can say that operators usually have some state associated with
them. Particularly for error handling and job restarts, I imagine windows
must persist state, and operators in general probably persist
Hello,
The documentation for broadcast state specifies that it is always kept in
memory. My assumptions based on this statement are:
1. If a job restarts in the same Flink cluster (i.e. using a restart
strategy), the tasks' attempt number increases and the broadcast state is
restored since it's
Hello,
I found an SO thread that clarifies some details of window state size [1].
I would just like to confirm that this also applies when using a global
window with a custom trigger.
The reason I ask is that the TriggerResult API is meant to cover all
supported scenarios, so FIRE vs
.
> Hope these can help you.
>
>
> On Sat, Oct 8, 2022 at 4:49 AM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>>
>> I found an SO thread that clarifies some details of window state size
>> [1]. I would just like to confirm
Hi everyone,
I am currently thinking about a use case for a streaming job and, while I'm
fairly certain it cannot be done with the APIs that Flink currently
provides, I figured I'd put it out there in case other users think
something like this would be useful to a wider audience.
The current
as normal keyedstream.
>
>
>
> Best Regards!
>
> Sent from Mail for Windows <https://go.microsoft.com/fwlink/?LinkId=550986>
>
>
>
> *From:* Alexis Sarda-Espinosa
> *Sent:* October 12, 2022 4:11
> *To:* user
> *Subject:* Partial broadcast/keyed connected streams
>
>
>
>
Hello,
I wrote a test for a broadcast function to check how it handles broadcast
state during retries [1] (the gist only shows a subset of the test in
Kotlin, but it's hopefully understandable). The test will not pass unless
my function also implements CheckpointedFunction, although those
Alexis Sarda-Espinosa <
sarda.espin...@gmail.com>:
> Hi Gyula,
>
> that certainly helps, but to set up automatic cleanup (in my case, of
> azure blob storage), the ideal option would be to be able to set a simple
> policy that deletes blobs that haven't been updated in some t
>
> @Gyula: Please correct me if I misunderstood something here.
>
> I hope that helped.
> Matthias
>
> On Wed, Dec 7, 2022 at 4:19 PM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> I see, thanks for the details.
>>
>> I do mea
Hi Salva,
Just to give you further thoughts from another user, I think the "temporal
join" semantics are very critical in this use case, and what you implement
for that may not easily generalize to other cases. Because of that, I'm not
sure if you can really define best practices that apply in
nt deletion?
> In this case, the checkpoint will be cleaned and not retained and the
> savepoint will remain. So you still could use savepoint to restore.
>
> On Mon, Dec 5, 2022 at 6:33 PM Alexis Sarda-Espinosa <
> sarda.espin...@gmail.com> wrote:
>
>> Hello,
>