Re: [PyFlink] Collect multiple elements in CoProcessFunction

2023-11-22 Thread Alexander Fedulov
unctions, so they use yield to emit results. > > David > > On Tue, Nov 7, 2023 at 1:16 PM Alexander Fedulov < > alexander.fedu...@gmail.com> wrote: > >> Java ProcessFunction API defines a clear way to collect data via the >> Collector object. >>

[PyFlink] Collect multiple elements in CoProcessFunction

2023-11-07 Thread Alexander Fedulov
Java ProcessFunction API defines a clear way to collect data via the Collector object. PyFlink documentation also refers to the Collector [1] , but it is not being passed to the function and is also nowhere to be found in the pyflink source code. How can multiple elements be collected? Is "yield"

Re: Flink custom parallel data source

2023-10-31 Thread Alexander Fedulov
ingestion rate exceeds the processing rate. You also lose any delivery guarantees because Flink's fault tolerance model relies on having replayable sources. Is using a message broker not feasible in your case? Best, Alexander Fedulov On Tue, 31 Oct 2023 at 13:08, Kamal Mittal wrote: > He

Re: Flink custom parallel data source

2023-10-31 Thread Alexander Fedulov
operators which you can scale independently from the source parallelism. Can you describe what you are trying to achieve? Best, Alexander Fedulov On Tue, 31 Oct 2023 at 07:22, Kamal Mittal via user wrote: > Hello Community, > > > > I need to have a custom parallel dat

Re: [DISCUSS][FLINK-33240] Document deprecated options as well

2023-10-31 Thread Alexander Fedulov
with this change? Best, Alexander Fedulov On Mon, 30 Oct 2023 at 18:24, Matthias Pohl wrote: > Thanks for your proposal, Zhanghao Chen. I think it adds more transparency > to the configuration documentation. > > +1 from my side on the proposal > > On Wed, Oct 11, 2023 at 2:0

Re: Enhancing File Processing and Kafka Integration with Flink Jobs

2023-10-28 Thread Alexander Fedulov
of the checkpoints you were advising against? > > To be sure, I was referring to moving the previously processed files away, > not the checkpoints themselves. > > On Fri, Oct 27, 2023 at 12:45 PM Alexander Fedulov < > alexander.fedu...@gmail.com> wrote: > >> > I wonde

Re: Enhancing File Processing and Kafka Integration with Flink Jobs

2023-10-27 Thread Alexander Fedulov
> I wonder if you could use this fact to query the committed checkpoints and move them away after the job is done. This is not a robust solution; I would advise against it. Best, Alexander On Fri, 27 Oct 2023 at 16:41, Andrew Otto wrote: > For moving the files: > > It will keep the files as

Re: Invalid Null Check in DefaultFileFilter

2023-10-27 Thread Alexander Fedulov
* with regard to the empty string. The null check is still a bit defensive and one could return false in test(), but it does not really matter since String.substring in getName() can never return null. On Fri, 27 Oct 2023 at 16:32, Alexander Fedulov wrote: > Actually, this is not even "d
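
The point about getName() can be checked in isolation. The following is a minimal sketch, assuming getName() is built on String.lastIndexOf and String.substring as the message implies. The class FileNameCheck and its plain-String path handling are hypothetical stand-ins for illustration, not Flink's Path or DefaultFileFilter.

```java
class FileNameCheck {
    // Hypothetical stand-in for a getName() that takes everything after the
    // last separator. String.substring returns a (possibly empty) String or
    // throws; it never returns null.
    static String getName(String path) {
        int slash = path.lastIndexOf('/'); // -1 when there is no separator
        return path.substring(slash + 1);
    }

    public static void main(String[] args) {
        String[] samples = {"/data/in/file.csv", "file.csv", "/data/in/", ""};
        for (String p : samples) {
            String name = getName(p);
            System.out.println("'" + p + "' -> null? " + (name == null)
                    + ", empty? " + name.isEmpty());
        }
    }
}
```

Under this assumption, a trailing separator or an empty path yields an empty string rather than null, so of the two conditions in the filter only the length check can ever be true.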

Re: Invalid Null Check in DefaultFileFilter

2023-10-27 Thread Alexander Fedulov
is it possible to get a null file name for some > sub directories and hence important to return true so that the File Source > can monitor inside those sub directories? > > On Friday, 27 October, 2023 at 12:58:44 am IST, Alexander Fedulov < > alexander.fedu...@gmail.com> wrote:

Re: [ANNOUNCE] Apache Flink 1.18.0 released

2023-10-26 Thread Alexander Fedulov
Great work, thanks everyone! Best, Alexander On Thu, 26 Oct 2023 at 21:15, Martijn Visser wrote: > Thank you all who have contributed! > > On Thu, 26 Oct 2023 at 18:41, Feng Jin wrote: > > > Thanks for the great work! Congratulations > > > > > > Best, > > Feng Jin > > > > On Fri, Oct 27, 2023

Re: Offset lost with AT_LEAST_ONCE kafka delivery guarantees

2023-10-26 Thread Alexander Fedulov
* to clarify: by different output I mean that for the same input message the output message could be slightly smaller due to the abovementioned factors and fall into the allowed size range without causing any failures On Thu, 26 Oct 2023 at 21:52, Alexander Fedulov wrote: > Your expectati

Re: Offset lost with AT_LEAST_ONCE kafka delivery guarantees

2023-10-26 Thread Alexander Fedulov
to reproduce the error reliably - this is something that needs to be further looked into. Best, Alexander Fedulov On Mon, 23 Oct 2023 at 19:11, Gabriele Modena wrote: > Hey folks, > > We currently run (py) flink 1.17 on k8s (managed by flink k8s > operator), with HA and checkpointing (f

Re: Invalid Null Check in DefaultFileFilter

2023-10-26 Thread Alexander Fedulov
, Alexander Fedulov On Thu, 26 Oct 2023 at 07:35, Chirag Dewan via user wrote: > Hi, > > I was looking at this check in DefaultFileFilter: > > public boolean test(Path path) { > final String fileName = path.getName(); > if (fileName == null || fileName.length() == 0) {

Re: Enhancing File Processing and Kafka Integration with Flink Jobs

2023-10-26 Thread Alexander Fedulov
processed in each file). In case of failures, the source will pick up where it left off. File removal is trickier - the easiest way to achieve that would be to have tombstones at the end of files and process them in user code. Best, Alexander Fedulov On Thu, 26 Oct 2023 at 18:17, arjun s wrote

Re: CSV Decoder with AVRO schema generated Object

2023-10-26 Thread Alexander Fedulov
file = env.fromSource(source, > WatermarkStrategy.*forMonotonousTimestamps*() > .withTimestampAssigner(new WatermarkAssigner((Object input) > -> System.*currentTimeMillis*())),"FileSource"); > file.print(); > } > > > > > > Regards, > >

Re: CSV Decoder with AVRO schema generated Object

2023-10-26 Thread Alexander Fedulov
es in your reader schema might help [1] [1] https://avro.apache.org/docs/1.8.1/spec.html#Aliases Best, Alexander Fedulov On Thu, 26 Oct 2023 at 16:24, Kirti Dhar Upadhyay K via user < user@flink.apache.org> wrote: > Hi Team, > > > > I am using Flink CSV Decoder with AVSC g

Re: Using HybridSource

2023-07-05 Thread Alexander Fedulov
and kafka topic that return >> different datatypes so I don't know how my answers relate to the original >> problem tbh. Regards, >> Oscar >> >> On Tue, 4 Jul 2023 at 20:53, Alexander Fedulov < >> alexander.fedu...@gmail.com> wrote: >>

Re: Using HybridSource

2023-07-04 Thread Alexander Fedulov
@Oscar 1. How do you plan to use that CSV data? Is it needed for lookup from the "main" stream? 2. Which API are you using? DataStream/SQL/Table or low level ProcessFunction? Best, Alex On Tue, 4 Jul 2023 at 11:14, Oscar Perez via user wrote: > ok, but is it? As I said, both sources have

Re: Identifying a flink dashboard

2023-06-30 Thread Alexander Fedulov
. > > I use this to add a tag to the header and for the > Flink-Dashboard I experience no glitches. > > > > As to point 3. … you don’t need to expose that Ingress to the internet, > but only to the node IP, so it becomes visible only within your network, … > there i

Re: RocksDB State Backend GET returns null intermittently

2023-06-27 Thread Alexander Fedulov
Hi Prabhu, make sure that the key you use is the same for both records and try to reproduce the issue with a parallelism of 1. Best, Alex On Sun, 25 Jun 2023 at 04:29, Hangxiang Yu wrote: > Hi, Prabhu. > > This is a correctness issue. IIUC, it should not be related to the size of >

Re: Identifying a flink dashboard

2023-06-27 Thread Alexander Fedulov
Hi Mike, no, it is currently hard-coded https://github.com/apache/flink/blob/master/flink-runtime-web/web-dashboard/src/app/app.component.html#L23 Your options are: 1. Contribute a change to make it configurable 2. Use some browser plugin that allows renaming page titles 3. Always use different

Re: How to set hdfs configuration in flink kubernetes operator

2023-06-23 Thread Alexander Fedulov
https://nightlies.apache.org/flink/flink-docs-release-1.17/docs/internals/filesystems/ On Fri, 23 Jun 2023 at 11:20, 李 琳 wrote: > > > Hi all, > > > > Recently, I have been testing the Flink Kubernetes Operator. In the > official example, the checkpoint/savepoint path is configured with a file >

Re: When does backpressure matter

2023-06-23 Thread Alexander Fedulov
Hi Lu, I would say that if your application is stable and checkpoints do not time out, there is no immediate need to do anything. The fact that the consumer lag stays low means that you are able to keep up with the incoming data. That said, the fact that you observe "constant backpressure"

Re: [ANNOUNCE] Flink Table Store Joins Apache Incubator as Apache Paimon(incubating)

2023-03-27 Thread Alexander Fedulov
Great to see this, congratulations! Best, Alex On Mon, 27 Mar 2023 at 11:24, Yu Li wrote: > Dear Flinkers, > > > > As you may have noticed, we are pleased to announce that Flink Table Store > has joined the Apache Incubator as a separate project called Apache > Paimon(incubating) [1] [2]

Re: UDFs classloading changes in 1.16

2022-11-04 Thread Alexander Fedulov
Hi Leonard, Sure, here is the new ticket: https://issues.apache.org/jira/browse/FLINK-29890 Best, Alexander Fedulov On Fri, Nov 4, 2022 at 2:12 PM Leonard Xu wrote: > Thanks Alexander for reporting this issue, Could you open a jira ticket as > well? > > CC: Shengkai, please

UDFs classloading changes in 1.16

2022-11-04 Thread Alexander Fedulov
/pull/20001 https://github.com/apache/flink/pull/19845 https://github.com/apache/flink/pull/20211 (fixes a similar issue introduced after classloading changes in 1.16) How can UDF JARs be loaded in 1.16? Best, Alexander Fedulov

Re: RE: Skip malformed messages with the KafkaSink

2022-09-08 Thread Alexander Fedulov
Can't you add a flatMap function just before the Sink that does exactly this verification and filters out everything that is not supposed to be sent downstream? Best, Alexander Fedulov On Thu, Sep 8, 2022 at 6:15 PM Salva Alcántara wrote: > Sorry I meant do nothing when the serialize met
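
The suggestion above can be sketched as follows. In Flink the logic would sit in a FlatMapFunction placed before the sink, emitting through its Collector; to keep the sketch self-contained, a java.util.function.Consumer stands in for the Collector here. The class name, the isWellFormed rule (non-blank and at most MAX_LENGTH characters) and the limit itself are assumptions made for illustration only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.Consumer;

class MalformedRecordFilter {
    // Hypothetical size limit; a real job would use whatever the sink enforces.
    static final int MAX_LENGTH = 16;

    // Hypothetical validation rule: non-null, non-blank, within the limit.
    static boolean isWellFormed(String value) {
        return value != null && !value.trim().isEmpty() && value.length() <= MAX_LENGTH;
    }

    // Mirrors the flatMap contract: emit zero or one element per input.
    static void flatMap(String value, Consumer<String> out) {
        if (isWellFormed(value)) {
            out.accept(value);
        }
        // Malformed input is simply not emitted, so it never reaches the sink.
    }

    // Applies flatMap to every input and gathers what would be sent downstream.
    static List<String> run(List<String> input) {
        List<String> forwarded = new ArrayList<>();
        for (String value : input) {
            flatMap(value, forwarded::add);
        }
        return forwarded;
    }

    public static void main(String[] args) {
        List<String> input = Arrays.asList("ok-1", "", "this value is far too long", "ok-2");
        System.out.println(run(input)); // prints [ok-1, ok-2]
    }
}
```

Keeping the check in a separate operator means the sink's serializer only ever sees records that already passed validation.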

Re: Where will the state be stored in the taskmanager when using rocksdbstatebend?

2022-09-06 Thread Alexander Fedulov
8s, taskmanager is a Pod. The data > directory size of a single container is limited in our company. Are there > any ideas to deal with this? > > -- > Best, > Hjw > > > > -- Original Message -- > *From:* "Alexander Fedulov"

Re: Flink kafka producer using the same transactional id twice?

2022-09-05 Thread Alexander Fedulov
k/flink-docs-release-1.15/docs/connectors/datastream/kafka/#fault-tolerance https://youtu.be/bhcFfS1-eDY?t=410 https://nightlies.apache.org/flink/flink-docs-release-1.13/docs/ops/state/checkpoints/#unaligned-checkpoints Best, Alexander Fedulov On Mon, Sep 5, 2022 at 2:21 PM Sebastian Struss wrote:

Re: Where will the state be stored in the taskmanager when using rocksdbstatebend?

2022-09-05 Thread Alexander Fedulov
https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/deployment/config/#state-backend-rocksdb-localdir Make sure to use a local SSD disk (not NFS/EBS). Best, Alexander Fedulov On Mon, Sep 5, 2022 at 7:24 PM hjw <1010445...@qq.com> wrote: > The EmbeddedRocksDBStateBack

Re: Any usage examples for flink-table-api-java-bridge?

2022-07-18 Thread Alexander Fedulov
You are welcome, glad it helped! Best, Alexander Fedulov On Mon, Jul 18, 2022 at 8:06 PM Salva Alcántara wrote: > For the record, Alexander Fedulov pointed me to an example within the > kafka connector: > > > https://github.com/apache/flink/blob/025675725336cd572aa2601be525efd

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Alexander Fedulov
Best, Alexander Fedulov On Mon, Jul 18, 2022 at 8:01 PM Salva Alcántara wrote: > Yep, that is mostly it. I have (DataStream) connector (sources & sink) > which works for a fixed type (`JsonNode` for what it's worth) as you say > and I want to reuse it for Table/SQL, which req

Re: Obtain Source (Sink) out of Source (Sink) and f:A->B

2022-07-18 Thread Alexander Fedulov
Hi Salva, what is the goal? Do you have some source that already has a fixed type and you want to reuse its functionality for producing a different data type? Best, Alexander Fedulov On Mon, Jul 18, 2022 at 1:29 PM Salva Alcántara wrote: > If I have a Source (Sink), what wo

Re: Re: Does Table API connector, csv, has some option to ignore some columns

2022-07-11 Thread Alexander Fedulov
ble/formats/csv/ Best, Alexander Fedulov On Mon, Jul 11, 2022 at 5:43 PM wrote: > No, I did not mean. > I said 'Does Table API connector, CSV, has some option to ignore some > columns in source file?' > > > *Sent:* Monday, July 11, 2022 at 5:28 PM > *From:* "Xuyang" &

Re: Unit test have Error "could not find implicit value for evidence parameter"

2022-07-11 Thread Alexander Fedulov
Hi Min Tu, try clean install to make sure the build starts from scratch. Refresh maven modules in IntelliJ after the build. If that doesn't work, try invalidating IntelliJ caches and/or reimporting the project (remove .idea folder). Best, Alexander Fedulov On Sun, Jul 10, 2022 at 12:59 AM

Re: Re: [ANNOUNCE] Apache Flink 1.15.1 released

2022-07-11 Thread Alexander Fedulov
Hi podunk, please share the exceptions that you find in the log/ folder of your Flink distribution. The TaskManager startup issues should be captured in the *-taskexecutor-* files. Best, Alexander Fedulov On Mon, Jul 11, 2022 at 5:42 PM Xuyang wrote: > Hi, can you provide the error log so that

Re: Is Flink able to read a CSV file or just like in Blink this function does not work?

2022-07-08 Thread Alexander Fedulov
verify. start-cluster.sh in Flink 1.15.x works fine on *nix systems. Best, Alexander Fedulov On Fri, Jul 8, 2022 at 7:17 PM wrote: > > Flink will not run natively on Windows - that is why I use Github CLI > > I made a test with Flink version 1.14.4 - TaskManager is running. But no

Re: Ignoring state's offset when restoring checkpoints

2022-07-08 Thread Alexander Fedulov
Hi Robin, you should be able to use the State Processor API to modify the operator state (sources) and override the offsets manually there. I never tried that, but I believe conceptually it should work. Best, Alexander Fedulov

Re: Is Flink able to read a CSV file or just like in Blink this function does not work?

2022-07-08 Thread Alexander Fedulov
k-dist-1.15.0.jar:1.15.0] > at > org.apache.flink.runtime.taskexecutor.TaskManagerRunner.runTaskManager(TaskManagerRunner.java:481) > ~[flink-dist-1.15.0.jar:1.15.0] > ... 5 more > > I'm trying to find the reason. > > > *Sent:* Friday, July 08, 2022 at 12:21 PM > *

Re: Is Flink able to read a CSV file or just like in Blink this function does not work?

2022-07-08 Thread Alexander Fedulov
heck in the UI that after you start your cluster you have > TaskManagers registered successfully'. > If I go to the 'Task Managers' menu ( > http://localhost:8081/#/task-manager) I do not see any - the list is empty. > > No idea what should be there or how to make one. > > *Se

Re: Is Flink able to read a CSV file or just like in Blink this function does not work?

2022-07-06 Thread Alexander Fedulov
file is specified, the job fails immediately, so I would actually expect that behavior if the issue was indeed with the file path. Which version of Flink are you running? Best, Alexander Fedulov On Wed, Jul 6, 2022 at 10:39 PM wrote: > If I'm reading Flink manual correc

Re: Restoring a job from a savepoint

2022-07-06 Thread Alexander Fedulov
see messages like this one for operators that did not have any state in the savepoint: *INFO o.a.f.r.c.CheckpointCoordinator [] - Skipping empty savepoint state for operator a0f11f7a2c416beb6b7aed14be0d63ca. * Best, Alexander Fedulov On Wed, Jul 6, 2022 at 9:50 PM John Tipper wrote: > Hi

Re: How to mock new DataSource/Sink

2022-07-04 Thread Alexander Fedulov
-source/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV0.java [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source#:~:text=%7D-,Usage%3A%C2%A0,-The%20envisioned%20usage Best, Alexander Fedulov On Mon

Re: Can FIFO compaction with RocksDB result in data loss?

2022-07-04 Thread Alexander Fedulov
nts, everything is straightforward - all SST files are copied over. The interplay of the incremental checkpoints and compaction is described in this [1] blog post. [1] https://flink.apache.org/features/2018/01/30/incremental-checkpointing.html Best, Alexander Fedulov On Mon, Jul 4, 2022 at 4:25

Re: Kafka Consumer commit error

2022-06-13 Thread Alexander Fedulov
Hi Christian, thanks for the reply. We use AT_LEAST_ONCE delivery semantics in this > application. Do you think this might still be related? No, in that case, Kafka transactions are not used, so it should not be relevant. Best, Alexander Fedulov On Mon, Jun 13, 2022 at 3:48 PM Christ

Re: Kafka Consumer commit error

2022-06-13 Thread Alexander Fedulov
, Alexander Fedulov [1] https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/connectors/datastream/kafka/#fault-tolerance On Mon, Jun 13, 2022 at 12:04 PM Martijn Visser wrote: > Hi Christian, > > I would expect that after the broker comes back up and recovers > completely,

Re: Can we resume a job from a savepoint from Java api?

2022-06-01 Thread Alexander Fedulov
/testsuites/SinkTestSuiteBase.java#L226 Best, Alexander Fedulov On Wed, Jun 1, 2022 at 3:33 PM Qing Lim wrote: > Thanks both, that’s perfect! > > > > *From:* Jing Ge > *Sent:* 01 June 2022 14:29 > *To:* yuxia > *Cc:* Qing Lim ; User > *Subject:* Re: Can we resume a job

Re: Is there an HA solution to run flink job with multiple source

2022-06-01 Thread Alexander Fedulov
Flink jobs. Maybe you could explain where specifically your situation does not fit in one of those two scenarios? Best, Alexander Fedulov On Wed, Jun 1, 2022 at 10:57 PM Jing Ge wrote: > Hi Bariša, > > Could you share the reason why your data processing pipeline should keep > runn

Re: Flink POJO documentation for primitive boolean state variables

2022-01-27 Thread Alexander Fedulov
Hi Mac, I just verified that objects with isXXX methods indeed will be interpreted as POJOs. Would you be willing to contribute a documentation update? Here are some guidelines: [1]. [1] https://flink.apache.org/contributing/contribute-documentation.html Thanks, Alexander Fedulov On Thu
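
For reference, a class of the shape discussed here might look like the sketch below: a public class with a public no-argument constructor and a private primitive boolean field exposed through an isXXX getter and a setter. The names PojoShapeDemo, SensorState and hasAccessorPair are hypothetical, and the reflection check is only a plain-Java illustration of the getter naming, not Flink's own type extraction logic.

```java
import java.lang.reflect.Method;

class PojoShapeDemo {
    // Hypothetical state class: the boolean field uses isActive() instead of getActive().
    public static class SensorState {
        private boolean active;

        public SensorState() {}

        public boolean isActive() { return active; }

        public void setActive(boolean active) { this.active = active; }
    }

    // Returns true if the type has a public getXxx()/isXxx() and a public setXxx(value)
    // for the given field name.
    static boolean hasAccessorPair(Class<?> type, String field) {
        String suffix = Character.toUpperCase(field.charAt(0)) + field.substring(1);
        boolean getter = false;
        boolean setter = false;
        for (Method m : type.getMethods()) {
            String name = m.getName();
            if ((name.equals("get" + suffix) || name.equals("is" + suffix))
                    && m.getParameterCount() == 0) {
                getter = true;
            }
            if (name.equals("set" + suffix) && m.getParameterCount() == 1) {
                setter = true;
            }
        }
        return getter && setter;
    }

    public static void main(String[] args) {
        System.out.println(hasAccessorPair(SensorState.class, "active"));
    }
}
```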

Re: Is it possible to support many different windows for many different keys?

2022-01-26 Thread Alexander Fedulov
https://flink.apache.org/news/2020/07/30/demo-fraud-detection-3.html Best, Alexander Fedulov On Wed, Jan 26, 2022 at 10:45 PM Marco Villalobos wrote: > Hi Alexander, > > Thank you for responding. The solution you proposed uses statically > defined windows. What I need are dynamically created

Re: Is it possible to support many different windows for many different keys?

2022-01-26 Thread Alexander Fedulov
is no need for separate Flink deployments to create such a pipeline. Best, Alexander Fedulov On Wed, Jan 26, 2022 at 6:47 PM Marco Villalobos wrote: > Hi, > > I am working with time series data in the form of (timestamp, name, > value), and an event time that is the timestamp when

Re: Keystore format limitations for TLS

2021-08-23 Thread Alexander Fedulov
with your certificate into this container at startup - Open a shell into this running connector, locate the "keytool" utility and try to use it to import the certificate Best, Alexander Fedulov | Solutions Architect <https://www.ververica.com/> Follow us @VervericaData

Re: Flink TTL for MapStates and Sideoutputs implementations

2020-05-28 Thread Alexander Fedulov
in the first place. Best, -- Alexander Fedulov | Solutions Architect <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time On Fri, May 22, 2020 at 8:57 AM Jaswin

Re: Multiple Sinks for a Single Soure

2020-05-28 Thread Alexander Fedulov
, ... *)*;` to decide where to dispatch the messages (see [1]), collect to none or many side outputs, depending on your logic. [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/side_output.html -- Alexander Fedulov | Solutions Architect <https://www.ververica.com/> Fol
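
The dispatch logic described here can be sketched in plain Java. In a Flink job, each named output below would be an OutputTag and each add would be a ctx.output(tag, value) call inside a ProcessFunction. The map of lists, the tag names and the routing rules in tagsFor are hypothetical and only illustrate that one message may go to none, one, or several outputs.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

class SideOutputDispatchSketch {
    // Hypothetical routing rules: a message can match zero, one, or both tags.
    static List<String> tagsFor(String message) {
        List<String> tags = new ArrayList<>();
        if (message.startsWith("ERROR")) {
            tags.add("alerts");
        }
        if (message.contains("audit")) {
            tags.add("audit-log");
        }
        return tags;
    }

    // Collects every message into each output it was routed to.
    static Map<String, List<String>> dispatch(List<String> messages) {
        Map<String, List<String>> outputs = new LinkedHashMap<>();
        for (String message : messages) {
            for (String tag : tagsFor(message)) {
                outputs.computeIfAbsent(tag, k -> new ArrayList<>()).add(message);
            }
        }
        return outputs;
    }

    public static void main(String[] args) {
        List<String> messages =
                Arrays.asList("ERROR audit failed", "INFO started", "INFO audit ok");
        System.out.println(dispatch(messages));
    }
}
```

Each output can then be connected to its own sink, which is what makes this pattern a fit for the multiple-sinks question.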

Re: Question about My Flink Application

2020-05-23 Thread Alexander Fedulov
in this example [2] [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/event_time.html [2] https://github.com/afedulov/fraud-detection-demo/blob/master/flink-job/src/main/java/com/ververica/field/dynamicrules/sources/RulesSource.java#L80 -- Alexander Fedulov | Solutions Architect <ht

Re: How do I get the IP of the master and slave files programmatically in Flink?

2020-05-20 Thread Alexander Fedulov
Hi Felippe, could you clarify in some more details what you are trying to achieve? Best regards, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The

Re: Question about My Flink Application

2020-05-20 Thread Alexander Fedulov
Hi Sara, do you have logs? Any exceptions in them? Best, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing |

Re: Testing process functions

2020-05-20 Thread Alexander Fedulov
here: https://github.com/afedulov/fraud-detection-demo/blob/master/flink-job/src/test/java/com/ververica/field/dynamicrules/RulesEvaluatorTest.java Hope this helps. Best regards, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @Ververi

Re: Flink operator throttle

2020-05-20 Thread Alexander Fedulov
to control precisely. Best, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @VervericaData -- Join Flink Forward <https://flink-forward.org/> - The Apache Flink Conference Stream Processing | Event Driven | Real Time -- Ve

Re: Watermarks and parallelism

2020-05-14 Thread Alexander Fedulov
llel-streams [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/stream/experimental.html#reinterpreting-a-pre-partitioned-data-stream-as-keyed-stream Best, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @VervericaData -- Join

Re: Does it make sense to use init containers for job upgrades in kubernetes

2020-04-30 Thread Alexander Fedulov
erverica.com/blog/announcing-ververica-platform-community-edition [2] https://www.ververica.com/getting-started [3] https://docs.ververica.com/getting_started/index.html Best regards, -- Alexander Fedulov | Solutions Architect +49 1514 6265796 <https://www.ververica.com/> Follow us @Ver

Re: doing demultiplexing using Apache flink

2020-04-30 Thread Alexander Fedulov
ontext); [1] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#streaming-file-sink [2] https://ci.apache.org/projects/flink/flink-docs-stable/dev/connectors/streamfile_sink.html#bucket-assignment -- Alexander Fedulov | Solutions Architect +49 1514 626

Re: doing demultiplexing using Apache flink

2020-04-29 Thread Alexander Fedulov
by the new `KafkaSerializationSchema`, which would require a slight modification, but, from what I can tell, it will still be possible to achieve such dynamic events dispatching. Best regards, Alexander Fedulov >