Re: Autoscaling with flink-k8s-operator 1.8.0

2024-05-02 Thread Chetas Joshi
Hi Gyula, Thanks for getting back and explaining the difference in the responsibilities of the autoscaler and the operator. I figured out what the issue was. Here is what I was trying to do: the autoscaler had initially down-scaled (2->1) the flinkDeployment so there was

Re: Autoscaling with flink-k8s-operator 1.8.0

2024-05-01 Thread Gyula Fóra
Hi Chetas, The operator logic itself would normally call the rescale api during the upgrade process, not the autoscaler module. The autoscaler module sets the correct config with the parallelism overrides, and then the operator performs the regular upgrade cycle (as when you yourself change
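The division of labor Gyula describes can be sketched as a FlinkDeployment fragment. This is an illustrative sketch only: the keys follow the operator autoscaler docs, and the vertex ID and parallelism value are made-up placeholders.

```yaml
apiVersion: flink.apache.org/v1beta1
kind: FlinkDeployment
metadata:
  name: example-job          # placeholder name
spec:
  flinkConfiguration:
    job.autoscaler.enabled: "true"
    # The autoscaler module only writes overrides like the one below into the
    # config; the operator then applies them through its regular upgrade
    # cycle, exactly as if a user had edited the deployment by hand.
    pipeline.jobvertex-parallelism-overrides: "bc764cd8ddf7a0cff126f51c16239658:2"
```

The hash keys the override to a specific job vertex; the value after the colon is the target parallelism.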

Re: [Flink Kubernetes Operator] The "last-state" upgrade mode is only supported in FlinkDeployments

2024-05-01 Thread Alan Zhang
Thanks for answering my questions, Gyula! And your insights are very helpful. Let me take a deeper look at the existing logic and think more. On Tue, Apr 30, 2024 at 12:00 PM Gyula Fóra wrote: > The application mode indeed has a sticky jobId (at least when we are > performing a last-state

Re: [Flink Kubernetes Operator] The "last-state" upgrade mode is only supported in FlinkDeployments

2024-04-30 Thread Gyula Fóra
The application mode indeed has a sticky jobId (at least when we are performing a last-state upgrade, otherwise a new jobId is generated during stateless deployments). But that's only part of the story and arguably the less important bit. The last-state upgrade mechanism for running/failing (but

Re: [Flink Kubernetes Operator] The "last-state" upgrade mode is only supported in FlinkDeployments

2024-04-30 Thread Alan Zhang
Hi Gyula, Thanks for your reply! Good suggestion on JIRA ticket, I created a JIRA ticket for tracking it: https://issues.apache.org/jira/browse/FLINK-35279. We could be interested in working on it because of our own requirement, I will check you and the community again once we have some updates.

Re: Looking for help with Job Initialisation issue

2024-04-30 Thread Abhi Sagar Khatri via user
Some more context: Our job graph has 5 different Tasks/operators/flink functions of which we are seeing this issue every time in a particular operator We’re using Unaligned checkpoints. With aligned checkpoint we don’t see this issue but the checkpoint duration in that case is very high and causes

Re: Flink sql retract to append

2024-04-30 Thread Zijun Zhao
If you deduplicate in ascending processing-time order, the result will definitely not produce retractions, because later times can never be smaller than the current time. You can try this deduplication approach. On Tue, Apr 30, 2024 at 3:35 PM 焦童 wrote: > Thanks for your suggestion, but top-1 also produces retraction messages > > > On 30 Apr 2024 at 15:27, ha.fen...@aisino.com wrote: > > > > You can refer to this > > > https://nightlies.apache.org/flink/flink-docs-release-1.19/zh/docs/dev/table/sql/queries/deduplication/ > >
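The deduplication pattern discussed in this thread can be sketched in Flink SQL; table and column names are illustrative placeholders:

```sql
-- Keep the FIRST row per key, ordered by processing time ascending.
-- Because processing time only moves forward, a kept row is never displaced
-- by a later one, so the result stream is append-only (no retractions).
SELECT id, item, proc_time
FROM (
  SELECT *,
         ROW_NUMBER() OVER (PARTITION BY id ORDER BY proc_time ASC) AS row_num
  FROM source_table
) WHERE row_num = 1
```

Ordering DESC (keep the latest row, i.e. top-1) replaces the kept row whenever newer data arrives, which is exactly what produces the retraction messages mentioned in this thread.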

Re: Submitting a job with CliFrontend in IDEA fails with java.nio.ByteBuffer.position(I)Ljava/nio/ByteBuffer;

2024-04-30 Thread Biao Geng
Hi, this error is usually caused by a JDK version mismatch. It's recommended to use the same Java version for building Flink and for running Flink jobs (either JDK 8 for both, or JDK 11 for both). For the missing sun.misc problem under JDK 11, try unchecking the "Use '--release' option for cross-compilation" option under IDEA's Settings -> Build, Execution and Deployment -> Compiler -> Java Compiler. Best, Biao Geng z_mmG <13520871...@163.com> wrote on Tue, Apr 30, 2024

Re: Flink sql retract to append

2024-04-30 Thread 焦童
Thanks for your suggestion, but top-1 also produces retraction messages > On 30 Apr 2024 at 15:27, ha.fen...@aisino.com wrote: > > You can refer to this > https://nightlies.apache.org/flink/flink-docs-release-1.19/zh/docs/dev/table/sql/queries/deduplication/ > Not sure whether version 1.11 supports it > > From: 焦童 > Date: 2024-04-30 11:25 > To: user-zh > Subject: Flink sql retract to append

Re: [Flink Kubernetes Operator] The "last-state" upgrade mode is only supported in FlinkDeployments

2024-04-29 Thread Gyula Fóra
Hi Alan! I think it should be possible to address this gap for most cases. We don't have the same robust way of getting the last-state information for session jobs as we do for applications, so it will be slightly less reliable overall. For session jobs the last checkpoint info has to be queried

Re: Suggestions for aggregating records to a Kinesis Sink, (or generic Async Sink)?

2024-04-29 Thread Michael Marino
Hi Ahmed, hi Hong, Thanks for your responses. It sounds like the most promising would be to initially focus on the Global Window with the custom trigger. We don't need to be compatible with the aggregation used by the KPL (actually we would likely combine records in protobuf, and my impression

Re: Suggestions for aggregating records to a Kinesis Sink, (or generic Async Sink)?

2024-04-29 Thread Ahmed Hamdy
Hi Michael, Unfortunately the new `KinesisDataStreamsSink` doesn't support aggregation yet. My suggestion, if you want to use native kinesis aggregation, is to use the latest connector version that supports KPL as a sink for the Table API; that would be 1.14.x. You could package the connector of that

Re: Flink SQL checkpoint failed when running on yarn

2024-04-29 Thread Biao Geng
Hi there, Would you mind sharing the whole JM/TM log? It looks like the error log in the previous email is not the root cause. Best, Biao Geng ou...@139.com 于2024年4月29日周一 16:07写道: > Hi all: >When I ran flink sql datagen source and wrote to jdbc, checkpoint kept > failing with the

Re: Flink SQL Client does not start job with savepoint

2024-04-29 Thread Lee, Keith
April 2024 at 11:37 To: "Lee, Keith" Cc: "user@flink.apache.org" Subject: RE: [EXTERNAL] Flink SQL Client does not start job with savepoint CAUTION: This email originated from outside of the organization. Do not click links or open attachments unless you can confirm

Re: Strange Problem (0 AvailableTask)

2024-04-28 Thread Hemi Grs
Alright, Thanks so much Biao ... On Sun, Apr 28, 2024 at 9:45 AM Biao Geng wrote: > Hi Hemi, > Glad to hear that your problem is solved! > As for deploying a flink cluster, you can check these docs for > more information based on your resource provider: > YARN: >

Re: CSV format and hdfs

2024-04-28 Thread gongzhongqiang
Hi Artem, I researched this and opened an issue[1]; Rob Young, Alexander Fedulov and I discussed it. We also think this performance issue can be solved by a manual flush. I have opened a PR[2]. You can cherry-pick it and build a package locally, then replace the jar in the lib folder. I'm willing to hear

Re: Strange Problem (0 AvailableTask)

2024-04-27 Thread Biao Geng
Hi Hemi, Glad to hear that your problem is solved! As for deploying a flink cluster, you can check these docs for more information based on your resource provider: YARN: https://nightlies.apache.org/flink/flink-docs-master/docs/deployment/resource-providers/yarn/ K8s:

Re: [External] Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
My guess is it’s a major known issue. Need a workaround. https://issues.apache.org/jira/browse/FLINK-32212 /Maxim From: prashant parbhane Date: Tuesday, April 23, 2024 at 11:09 PM To: user@flink.apache.org Subject: [External] Regarding java.lang.IllegalStateException Hello, We have been facing

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Gyula Fóra
s,* > > *Maxim* > > > > *From: *Gyula Fóra > *Date: *Friday, April 26, 2024 at 1:10 AM > *To: *Maxim Senin > *Cc: *Maxim Senin via user > *Subject: *Re: [External] Exception during autoscaling operation - Flink > 1.18/Operator 1.8.0 > > Hi Maxim! > > &

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
oyment [INFO ][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Deleting Kubernetes HA metadata Any ideas? Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink 1.18/

Re: Regarding java.lang.IllegalStateException

2024-04-26 Thread Maxim Senin via user
We are also seeing something similar: 2024-04-26 16:30:44,401 INFO org.apache.flink.runtime.executiongraph.ExecutionGraph [] - Source: Power Consumption:power_consumption -> Ingest Power Consumption -> PopSysFields -> WindowingWatermarkPreCheck (1/1)

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Maxim Senin via user
still a mystery. Thanks, Maxim From: Gyula Fóra Date: Friday, April 26, 2024 at 1:10 AM To: Maxim Senin Cc: Maxim Senin via user Subject: Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0 Hi Maxim! Regarding the status update error, it could be related

Re: Strange Problem (0 AvailableTask)

2024-04-26 Thread Hemi Grs
Hi Biao, Thanks for your reply, fortunately the problem is solved. All I did was change the bind-host to 0.0.0.0 (previously it was set to the server's IP). I don't know if it's best practice or not but everything is working fine now. Right now I am using flink as standalone (I have the

Re: Flink SQL Client does not start job with savepoint

2024-04-26 Thread Biao Geng
Hi Lee, A quick question: what version of flink are you using for testing execution.state-recovery.path? It looks like this config is only supported in flink 1.20 which is not released yet. Best, Biao Geng Lee, Keith 于2024年4月26日周五 04:51写道:

Re: Strange Problem (0 AvailableTask)

2024-04-26 Thread Biao Geng
Hi Hemi, How do you start your flink cluster? Are you using standalone cluster or using k8s/yarn as resource providers? Also, it would be very helpful if you can share the full jobmanager log. Best, Biao Geng Hemi Grs 于2024年4月18日周四 15:43写道: > Hello, > > I have several versions of Flink

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-26 Thread Gyula Fóra
Hi Maxim! Regarding the status update error, it could be related to a problem that we have discovered recently with the Flink Operator HA. Where during a namespace change both leader and follower instances would start processing. It has been fixed in the current master by updating the JOSDK

Re: Async code inside Flink Sink

2024-04-26 Thread Biao Geng
Hi Jacob, For your first question, I think it is fine to use Java CompletableFuture for your case. If we create lots of threads, of course it would consume more CPU and influence the processing of records. But in your case, the close op may not be very heavy. One thing that comes to mind is that when

Re: [External] Exception during autoscaling operation - Flink 1.18/Operator 1.8.0

2024-04-25 Thread Maxim Senin via user
I have also seen this exception: o.a.f.k.o.o.JobStatusObserver [ERROR][flink/f-d7681d0f-c093-5d8a-b5f5-2b66b4547bf6] Job d0ac9da5959d8cc9a82645eeef6751a5 failed with error: java.util.concurrent.CompletionException: java.util.concurrent.CompletionException:

Re: CSV format and hdfs

2024-04-25 Thread Robert Young
Hi Artem, I had a debug of Flink 1.17.1 (running CsvFilesystemBatchITCase) and I see the same behaviour. It's the same on master too. Jackson flushes [1] the underlying stream after every `writeValue` call. I experimented with disabling the flush by disabling Jackson's FLUSH_PASSED_TO_STREAM [2]

Re: Flink SQL Client does not start job with savepoint

2024-04-25 Thread Lee, Keith
Apologies, I have included the jobmanager log for 6969725a69ecc967aac2ce3eedcc274a instead of 7881d53d28751f9bbbd3581976d9fe3d, however they looked exactly the same. Can include if necessary. Thanks Keith From: "Lee, Keith" Date: Thursday, 25 April 2024 at 21:41 To: "user@flink.apache.org"

Re: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-23 Thread Yanfei Lei
Hi JM, > why having "transactional.id.expiration.ms" < "transaction.timeout.ms" helps When recovering a job from a checkpoint/savepoint which contains Kafka transactions, Flink will try to re-commit those transactions based on the transaction ID upon recovery. If

Re: Processing-time tumbling window fires early

2024-04-23 Thread Xuyang
Hi, I see you are using System.currentTimeMillis(); could this be caused by clock differences across multiple TM machines in a distributed setup? -- Best! Xuyang On 2024-04-20 19:04:14, "hhq" <424028...@qq.com.INVALID> wrote: > I used a processing-time tumbling window with the size set to 60s, but in the window's process function I compare the window end time with the system time, and occasionally the system time I get is earlier than the window end time (the gap is small, only a few milliseconds, but I don't know whether this comes from the Flink window itself or from my code). I couldn't find the cause; asking for help

Re: FlinkCEP

2024-04-23 Thread Biao Geng
Hi, As Zhongqiang said, the CEP API is stable. Besides that, there are some changes worth mentioning: 1. https://issues.apache.org/jira/browse/FLINK-23890 Since flink 1.16.0, the timer creation is optimized, which can greatly reduce the resource usage of the cep operator given the same workload. 2.

Re: FlinkCEP

2024-04-23 Thread gongzhongqiang
Hi, After flink 1.5 , there have been no major changes to the CEP API. Best, Zhongqiang Gong Esa Heikkinen 于2024年4月23日周二 04:19写道: > Hi > > It's been over 5 years since I last did anything with FlinkCEP and Flink. > > Has there been any significant development in FlinkCEP during this time? > >

Re: Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-23 Thread Lei Wang
Sorry, it was probably an observation mistake. I export the metrics to Prometheus and query the result on grafana, actually the usage will not exceed the capacity Thanks, Lei On Fri, Apr 19, 2024 at 9:55 AM Hangxiang Yu wrote: > Hi, Lei. > It's indeed a bit confusing. Could you share the

RE: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-23 Thread Jean-Marc Paulin
___ From: Yanfei Lei Sent: Monday, April 22, 2024 03:28 To: Jean-Marc Paulin Cc: user@flink.apache.org Subject: [EXTERNAL] Re: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException Hi JM, Yes, `InvalidPidMappingException` occurs because the tra

Re: Flink 1.18: Unable to resume from a savepoint with error InvalidPidMappingException

2024-04-21 Thread Yanfei Lei
Hi JM, Yes, `InvalidPidMappingException` occurs because the transaction is lost in most cases. For short-term, " transaction.timeout.ms" > "transactional.id.expiration.ms" can ignore the `InvalidPidMappingException`[1]. For long-term, FLIP-319[2] provides a solution. [1]
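The short-term workaround can be sketched as config fragments. The values below are purely illustrative, and raising the producer-side timeout may also require raising the broker's `transaction.max.timeout.ms` cap:

```properties
# Flink Kafka sink (producer) property: keep the transaction timeout ABOVE
# the broker's transactional id expiration, per the workaround above.
transaction.timeout.ms=7200000

# Kafka broker property: ids expire before the Flink-side timeout, so a
# re-commit of an already-expired (hence already-completed) transaction is
# ignored instead of failing with InvalidPidMappingException.
transactional.id.expiration.ms=3600000
```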

Re: Flink 1.18.1 cannot read from Kafka

2024-04-21 Thread Phil Stavridis
Thanks Biao. Kind regards Phil > On 14 Apr 2024, at 18:04, Biao Geng wrote: > > Hi Phil, > > You can check my github link > > for a detailed tutorial and example codes :). > > Best, > Biao Geng > > Phil

Re: Why RocksDB metrics cache-usage is larger than cache-capacity

2024-04-18 Thread Hangxiang Yu
Hi, Lei. It's indeed a bit confusing. Could you share the related rocksdb log which may contain more detailed info ? On Fri, Apr 12, 2024 at 12:49 PM Lei Wang wrote: > > I enable RocksDB native metrics and do some performance tuning. > > state.backend.rocksdb.block.cache-size is set to 128m,4

RE: Watermark advancing too quickly when reprocessing events with event time from Kafka

2024-04-18 Thread Tyron Zerafa
Hi, I’m experiencing the same issue on Flink 1.18.1. I have a slightly different job graph. I have a single Kafka Source (parallelism 6) that is consuming from 2 topics, one topic with 4 partitions and one topic with 6 partitions. The autoWatermarkInterval change to 0 didn’t fix my issue. Did

Re: What should be considered when applying Flink's unified stream-batch processing to data reconciliation in a real-time data warehouse?

2024-04-18 Thread Yunfeng Zhou
Stream mode and batch mode differ in watermarks and the semantics of some operators, but I don't see any differences in the Join and Window operators, so this should be supported in batch mode. For a detailed comparison of the two modes, see this doc: https://nightlies.apache.org/flink/flink-docs-master/zh/docs/dev/datastream/execution_mode/ On Thu, Apr 18, 2024 at 9:44 AM casel.chen wrote: > > Has anyone tried this in practice? Could you give some advice? Thanks!

Re: Parallelism for auto-scaling, memory for auto-tuning - Flink operator

2024-04-17 Thread Zhanghao Chen
If you have some experience before, I'd recommend setting a good parallelism and TM resource spec first, to give the autotuner a good starting point. Usually, the autoscaler can tune your jobs well within a few attempts (<=3). As for `pekko.ask.timeout`, the default value should be sufficient

Re: Understanding default firings in case of allowed lateness

2024-04-17 Thread Sachin Mittal
f this is handled as default case ? > > Maybe side output[1] can help you to collect the late data and re-compute > them. > [1] > https://nightlies.apache.org/flink/flink-docs-master/docs/dev/datastream/side_output/ > > -- > Best! > Xuyang > > > At 2024-0

Re: Elasticsearch8 example

2024-04-17 Thread Hang Ruan
Hi Tauseef. I see that the support of Elasticsearch 8[1] will be released in elasticsearch-3.1.0, so there are no docs for elasticsearch8 yet. We can learn how to use it from some tests[2] until the docs are ready. Best, Hang [1] https://issues.apache.org/jira/browse/FLINK-26088 [2]

Re: Table Source from Parquet Bug

2024-04-17 Thread Hang Ruan
Hi, David. Have you added the parquet format[1] dependency in your dependencies? It seems that the class ParquetColumnarRowInputFormat cannot be found. Best, Hang [1] https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/connectors/table/formats/parquet/ Sohil Shah 于2024年4月17日周三

Re: Iceberg connector

2024-04-16 Thread Péter Váry
Hi Chetas, > the only way out to use only the DataStream API (and not the table api) if I want to use a custom splitComparator? You can use watermark generation, and with that, watermark based split ordering using the table api. OTOH, currently there is no way to define a custom comparator using

Re: Iceberg connector

2024-04-16 Thread Chetas Joshi
Hi Péter, Great! Thanks! The resources are really useful. I don't have TABLE_EXEC_ICEBERG_USE_FLIP27_SOURCE set so it is the FlinkSource

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Robert Prat
) ?? From: Péter Váry Sent: Tuesday, April 16, 2024 9:56 PM To: Robert Prat Cc: Oscar Perez via user Subject: Re: Pyflink w Nessie and Iceberg in S3 Jars Is it intentional, that you are using iceberg-flink-runtime-1.16-1.3.1.jar with 1.18.0 PyFlink? This might cause issues later. I would

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Péter Váry
Is it intentional, that you are using iceberg-flink-runtime-1.16-1.3.1.jar with 1.18.0 PyFlink? This might cause issues later. I would try to synchronize the Flink versions throughout all the dependencies. On Tue, Apr 16, 2024, 11:23 Robert Prat wrote: > I finally managed to make it work

Re: Iceberg connector

2024-04-16 Thread Péter Váry
Hi Chetas, See my answers below: On Tue, Apr 16, 2024, 06:39 Chetas Joshi wrote: > Hello, > > I am running a batch flink job to read an iceberg table. I want to > understand a few things. > > 1. How does the FlinkSplitPlanner decide which fileScanTasks (I think one > task corresponds to one

RE: Table Source from Parquet Bug

2024-04-16 Thread Sohil Shah
Hi David, Since this is a ClassNotFoundException, you may be missing a dependency. Could you share your pom.xml. Thanks -Sohil Project: Braineous https://bugsbunnyshah.github.io/braineous/ On 2024/04/16 15:22:34 David Silva via user wrote: > Hi, > > Our team would like to leverage Flink but

Re: Table Source from Parquet Bug

2024-04-16 Thread Sohil Shah
Hello David, Since this is a ClassNotFoundException, you may be missing a dependency. Could you share your pom.xml? Thanks -Sohil Project: Braineous https://bugsbunnyshah.github.io/braineous/ On Tue, Apr 16, 2024 at 11:25 AM David Silva via user wrote: > Hi, > > Our team would like to leverage

Re: GCS FileSink Read Timeouts

2024-04-16 Thread Dylan Fontana via user
Thanks for the links! We've tried the `gs.writer.chunk.size` before and found it didn't make a meaningful difference unfortunately. The hadoop-connector link you've sent I think is actually not applicable since the gcs Filesystem connector isn't using the hadoop implementation but instead the

Re: Pyflink Performance and Benchmark

2024-04-16 Thread Chase Zhang
On Mon, Apr 15, 2024 at 16:17 Niklas Wilcke wrote: > Hi Flink Community, > I wanted to reach out to you to get some input about Pyflink performance. > Are there any resources available about Pyflink benchmarks and maybe a > comparison with the Java API? I wasn't able to find something

Re: Pyflink w Nessie and Iceberg in S3 Jars

2024-04-16 Thread Robert Prat
I finally managed to make it work following the advice of Robin Moffat who replied to the earlier email: There's a lot of permutations that you've described, so it's hard to take one reproducible test case here to try and identify the error :) It certainly looks JAR related. You could try

Re: Optimize exact deduplication for tens of billions data per day

2024-04-15 Thread Alex Cruise
It may not be completely relevant to this conversation in this year, but I find myself sharing this article once or twice a year when opining about how hard deduplication at scale can be.  -0xe1a On Thu, Apr 11, 2024 at 10:22 PM Péter Váry

Re: [EXTERNAL]Re: Pyflink Performance and Benchmark

2024-04-15 Thread Niklas Wilcke
Hi Zhanghao Chen, thanks for sharing the link. This looks quite interesting! Regards, Niklas > On 15. Apr 2024, at 12:43, Zhanghao Chen wrote: > > When it comes down to the actual runtime, what really matters is the plan > optimization and the operator impl & shuffling. You might be

Re: Flink job performance

2024-04-15 Thread Kenan Kılıçtepe
How many taskmanagers and servers do you have? Can you also share the task managers page of the flink dashboard? On Mon, Apr 15, 2024 at 10:58 AM Oscar Perez via user wrote: > Hi community! > > We have an interesting problem with Flink after increasing parallelism in > a certain way. Here is the

Re: Flink job performance

2024-04-15 Thread Oscar Perez via user
almost full, could you try allocating more CPUs and see if the > instability persists? > > Best, > Zhanghao Chen > -- > *From:* Oscar Perez > *Sent:* Monday, April 15, 2024 19:24 > *To:* Zhanghao Chen > *Cc:* Oscar Perez via user > *Subje

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
if the instability persists? Best, Zhanghao Chen From: Oscar Perez Sent: Monday, April 15, 2024 19:24 To: Zhanghao Chen Cc: Oscar Perez via user Subject: Re: Flink job performance Hei, ok that is weird. Let me resend them. Regards, Oscar On Mon, 15 Apr 2024 at 14:00, Zhanghao

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
Hi, there seems to be something wrong with the two images attached in the latest email. I cannot open them. Best, Zhanghao Chen From: Oscar Perez via user Sent: Monday, April 15, 2024 15:57 To: Oscar Perez via user ; pi-team ; Hermes Team Subject: Flink job

Re: Understanding event time wrt watermarking strategy in flink

2024-04-15 Thread Sachin Mittal
Hi Yunfeng, So regarding the dropping of records for the out-of-order watermark, let's say records later than T - B will be dropped by the first operator after watermarking, which is reading from the source. So then these records will never be forwarded to the step where we do event-time windowing.

Re: Pyflink Performance and Benchmark

2024-04-15 Thread Zhanghao Chen
When it comes down to the actual runtime, what really matters is the plan optimization and the operator impl & shuffling. You might be interested in this blog: https://flink.apache.org/2022/05/06/exploring-the-thread-mode-in-pyflink/, which did a benchmark on the latter with the common the

Re: Flink job performance

2024-04-15 Thread Zhanghao Chen
Hi Oscar, The rebalance operation will go over the network stack, but not necessarily involving remote data shuffle. For data shuffling between tasks of the same node, the local channel is used, but compared to chained operators, it still introduces extra data serialization overhead. For data

Re: Understanding event time wrt watermarking strategy in flink

2024-04-14 Thread Yunfeng Zhou
Hi Sachin, Firstly sorry for my misunderstanding about watermarking in the last email. When you configure an out-of-orderness watermark with a tolerance of B, the next watermark emitted after a record with timestamp T would be T-B instead of T as described in my last email. Then let's go back to
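The generator behaviour described here can be sketched in a few lines of plain Python (not Flink code; Flink's BoundedOutOfOrdernessWatermarks also subtracts an extra 1 ms, omitted here for clarity):

```python
class BoundedOutOfOrderness:
    """Watermark trails the max seen timestamp by the tolerance B."""

    def __init__(self, b):
        self.b = b
        self.max_ts = float("-inf")

    def on_event(self, ts):
        # Out-of-order records never move the watermark backwards.
        self.max_ts = max(self.max_ts, ts)

    def watermark(self):
        return self.max_ts - self.b

    def is_late(self, ts):
        # A record at or below the current watermark is considered late.
        return ts <= self.watermark()
```

For example, with B = 5, after records with timestamps 100 and 103 the watermark is 98: a record arriving with timestamp 99 is still accepted, while one with timestamp 97 is late.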

RE: Flink 1.18.1 cannot read from Kafka

2024-04-14 Thread Sohil Shah
Hi Phil, if __name__ == "__main__": process_table() error: flink_app | Caused by: org.apache.flink.table.api.ValidationException: Could not find any factory for identifier 'kafka' that implements 'org.apache.flink.table.factories.DynamicTableFactory' in the classpath. flink_app | flink_app |

Re: Flink 1.18.1 cannot read from Kafka

2024-04-14 Thread Biao Geng
Hi Phil, You can check my github link for a detailed tutorial and example codes :). Best, Biao Geng Phil Stavridis 于2024年4月12日周五 19:10写道: > Hi Biao, > > Thanks for looking into it and providing a detailed example. >

Re: Using per-window state in ProcessWindowFunction

2024-04-12 Thread gongzhongqiang
Hello, you can use globalState / windowState to fetch previous state for incremental computation. The following demo may help understanding: public class ProcessWindowFunctionDemo { public static void main(String[] args) throws Exception { final StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment(); // use processing time

Re: Understanding event time wrt watermarking strategy in flink

2024-04-12 Thread Sachin Mittal
Hi Yunfeng, I have a question around the tolerance for out-of-order bound watermarking. What I understand is that when consuming from a source with the out-of-order bound set as B, let's say it gets a record with timestamp T. After that it will drop all the subsequent records which arrive with the

Re: Understanding event time wrt watermarking strategy in flink

2024-04-12 Thread Yunfeng Zhou
Hi Sachin, 1. When your Flink job performs an operation like map or flatmap, the output records would be automatically assigned with the same timestamp as the input record. You don't need to manually assign the timestamp in each step. So the windowing result in your example should be as you have

Re: Optimize exact deduplication for tens of billions data per day

2024-04-11 Thread Péter Váry
Hi Lei, There is an additional overhead when adding new keys to an operator, since Flink needs to maintain the state, timers etc. for the individual keys. If you are interested in more details, I suggest using the Flink UI and comparing the flamegraphs for the stages. There you can see the difference

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
> *To:* Zhanghao Chen > *Cc:* Biao Geng ; user > *Subject:* Re: How to enable RocksDB native metrics? > > Hi Zhanghao, > > flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G > -yqu default -p 8 -yDstate.backend.latency-track.keyed-state-ena

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Zhanghao Chen
Add a space between -yD and the param should do the trick. Best, Zhanghao Chen From: Lei Wang Sent: Thursday, April 11, 2024 19:40 To: Zhanghao Chen Cc: Biao Geng ; user Subject: Re: How to enable RocksDB native metrics? Hi Zhanghao, flink run -m yarn-cluster
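Putting the thread's advice together, the corrected command shape looks like this (a sketch based on the command quoted elsewhere in the thread; jar, class, and option values come from that example):

```shell
flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G \
  -yD state.backend.latency-track.keyed-state-enabled=true \
  -c com.zkj.task.EventCleaningTask SourceDataCleaning-wl_0410.jar
```

The old-style YARN CLI expects `-yD key=value` (note the space) rather than `-Dkey=value`.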

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Lei Wang
tyled CLI for YARN jobs where "-yD" instead of "-D" > should be used. > -- > *From:* Lei Wang > *Sent:* Thursday, April 11, 2024 12:39 > *To:* Biao Geng > *Cc:* user > *Subject:* Re: How to enable RocksDB native metrics? &g

Re: How to enable RocksDB native metrics?

2024-04-11 Thread Zhanghao Chen
Hi Lei, You are using an old-styled CLI for YARN jobs where "-yD" instead of "-D" should be used. From: Lei Wang Sent: Thursday, April 11, 2024 12:39 To: Biao Geng Cc: user Subject: Re: How to enable RocksDB native metrics? Hi Biao, I

Re: Optimize exact deduplication for tens of billions data per day

2024-04-11 Thread Lei Wang
Hi Peter, I tried; this improved performance significantly, but I don't know exactly why. According to what I know, the number of keys in RocksDB doesn't decrease. Any specific technical material about this? Thanks, Lei On Fri, Mar 29, 2024 at 9:49 PM Lei Wang wrote: > Perhaps I can

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
Hi Biao, I tried, it doesn't work. The cmd is: flink run -m yarn-cluster -ys 4 -ynm EventCleaning_wl -yjm 2G -ytm 16G -yqu default -p 8 -Dstate.backend.latency-track.keyed-state-enabled=true -c com.zkj.task.EventCleaningTask SourceDataCleaning-wl_0410.jar --sourceTopic dwd_audio_record

Re: How to enable RocksDB native metrics?

2024-04-10 Thread Lei Wang
Hi Biao, I tried, it does On Mon, Apr 8, 2024 at 9:48 AM Biao Geng wrote: > Hi Lei, > You can use the "-D" option in the command line to set configs for a > specific job. E.g, `flink run-application -t > yarn-application -Djobmanager.memory.process.size=1024m `. > See >

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, I will check out running with flink run, but should this be run in the Flink JobManager? Would that mean that the container for the Flink JobManager would require both Python installed and a copy of the flink_client.py module? Are there some examples of running flink run in a

Re: How are window's boundaries decided in flink

2024-04-10 Thread Dylan Fontana via user
Hi Sachin, Assignment for tumbling windows is exclusive on the endTime; see description here https://nightlies.apache.org/flink/flink-docs-release-1.18/docs/dev/datastream/operators/windows/#tumbling-windows . So in your example it would be assigned to window (60, 120) as in reality the windows
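The end-exclusive assignment rule can be sketched directly (plain Python, mirroring TimeWindow.getWindowStartWithOffset with a zero offset):

```python
def tumbling_window(ts, size):
    """Return the (start, end) of the tumbling window containing ts.

    Windows cover [start, start + size): the end timestamp is exclusive,
    so a record stamped exactly at a window's end falls into the NEXT one.
    """
    start = ts - (ts % size)
    return (start, start + size)
```

With a 60-unit window, a record at timestamp 119 lands in (60, 120) while a record at exactly 120 lands in (120, 180).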

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Biao Geng
Hi Phil, It should be totally ok to use `python -m flink_client.job`. It just seems to me that the flink cli is being used more often. And yes, you also need to add the sql connector jar to the flink_client container. After putting the jar in your client container, add codes like

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, 1. I have a Flink client container like this: # Flink client flink_client: container_name: flink_client image: flink-client:local build: context: . dockerfile: flink_client/Dockerfile networks: - standard depends_on: - jobmanager - Kafka The flink_client/Dockerfile has this bash file

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Biao Geng
Hi Phil, Your codes look good. I mean how do you run the python script. Maybe you are using flink cli? i.e. run commands like ` flink run -t .. -py job.py -j /path/to/flink-sql-kafka-connector.jar`. If that's the case, the `-j /path/to/flink-sql-kafka-connector.jar` is necessary so that in client

Re: Flink 1.18.1 cannot read from Kafka

2024-04-10 Thread Phil Stavridis
Hi Biao, For submitting the job, I run t_env.execute_sql. Shouldn’t that be sufficient for submitting the job using the Table API with PyFlink? Isn’t that the recommended way for submitting and running PyFlink jobs on a running Flink cluster? The Flink cluster runs without issues, but there is

Re: Completed Flink jobs disappear after a while

2024-04-09 Thread gongzhongqiang
Hello: If you want to keep completed jobs for the long term, the History Server is recommended: https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/deployment/config/#history-server Best, Zhongqiang Gong ha.fen...@aisino.com wrote on Tue, Apr 9, 2024 at 10:39: > In the Web UI, completed jobs can be seen under completed jobs, but when I go back a while later the data is gone. Is there some configuration that deletes it automatically?
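The History Server setup referenced above boils down to a few flink-conf.yaml entries; the paths and port here are illustrative placeholders:

```yaml
# Where the JobManager uploads archives of completed jobs:
jobmanager.archive.fs.dir: s3://my-bucket/completed-jobs/
# Where the History Server process looks for those archives:
historyserver.archive.fs.dir: s3://my-bucket/completed-jobs/
historyserver.archive.fs.refresh-interval: 10000
historyserver.web.address: 0.0.0.0
historyserver.web.port: 8082
```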

Re: Flink 1.18.1 cannot read from Kafka

2024-04-09 Thread Biao Geng
Hi Phil, Thanks for sharing the detailed information of the job. For you question, how to you submit the job? After applying your yaml file, I think you will successfully launch a flink cluster with 1 JM and 1 TM. Then you would submit the pyflink job to the flink cluster. As the error you showed

Re: Use of data generator source

2024-04-09 Thread Lasse Nedergaard
Hi Tkachenko, Yes I have, and we use it extensively for unit testing. But we also have integration testing as part of our project, and here I run into the problem. In my previous implementation I used the SourceFunction interface and added a delay in the run function, but it's deprecated so I have

Re: Use of data generator source

2024-04-09 Thread Yaroslav Tkachenko
Hi Lasse, Have you seen this approach https://nightlies.apache.org/flink/flink-docs-release-1.19/docs/dev/datastream/testing/#unit-testing-stateful-or-timely-udfs--custom-operators ? On Tue, Apr 9, 2024 at 7:09 AM Lasse Nedergaard < lassenedergaardfl...@gmail.com> wrote: > Hi. > > I my

Re: Debugging Kryo Fallback

2024-04-09 Thread Salva Alcántara
{ "emoji": "", "version": 1 }

Re: Debugging Kryo Fallback

2024-04-09 Thread Zhanghao Chen
nghao Chen From: Salva Alcántara Sent: Monday, April 8, 2024 16:01 To: Yunfeng Zhou Cc: user@flink.apache.org Subject: Re: Debugging Kryo Fallback Yeah I think you're right and there is no need for anything, really. I was thinking of having more user friendly tests for my POJOs f

Re: Re: OOM during MySQL full snapshot capture

2024-04-09 Thread gongzhongqiang
Solutions to try: - Increase the JM memory (as Shawn Huang said) - Tune the batch read size during the snapshot phase to reduce the state size, relieving JM memory pressure during checkpoints. Best, Zhongqiang Gong wyk wrote on Tue, Apr 9, 2024 at 16:56: > > Yes, there are quite a lot of splits, more than 17,000

Re: OOM during MySQL full snapshot capture

2024-04-08 Thread Shawn Huang
From the error message, the JM heap memory is insufficient; you can try increasing the JM memory. One possible cause is that the MySQL table's full snapshot phase has many splits, making the SourceEnumerator state large. Best, Shawn Huang wyk wrote on Mon, Apr 8, 2024 at 17:46: > > > Developers: > Flink version 1.14.5 > flink-cdc version 2.2.0 > > When using flink-cdc-mysql for a full capture, checkpoints are taken during the full phase, but an OOM occurs during checkpointing. Is there any way around this? >
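The two mitigations suggested in this thread map to one Flink setting and one MySQL CDC source option; the values below are illustrative only:

```yaml
# Give the JobManager more memory so the enumerator state fits on the heap:
jobmanager.memory.process.size: 4096m
```

On the source definition, a larger `scan.incremental.snapshot.chunk.size` (rows per split) means fewer splits for the same table, and therefore a smaller SourceEnumerator state to checkpoint.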

Re: How to debug window step in flink

2024-04-08 Thread Sachin Mittal
Hi, Yes it was a watermarking issue. There were a few out-of-order records in my stream, and as per the watermarking strategy the watermark was advanced to the future, hence current events were getting discarded. I have fixed this by not processing future-timestamped records. Thanks Sachin On Mon,

Re: Debugging Kryo Fallback

2024-04-08 Thread Salva Alcántara
Yeah I think you're right and there is no need for anything, really. I was thinking of having more user friendly tests for my POJOs for which I checked the Kryo Fallback and if detected provide an exhaustive list of issues found (vs raising an exception for the first problem, requiring users to

Re: Debugging Kryo Fallback

2024-04-08 Thread Yunfeng Zhou
Hi Salva, Could you please give me some hint about the issues Flink can collect apart from the exception and the existing logs? Suppose we record the exception in the log and the Flink job continues, I can imagine that similar Kryo exceptions from each of the rest records will then appear in the

Re: How to debug window step in flink

2024-04-08 Thread Dominik.Buenzli
Hi Sachin What exactly does the MyReducer do? Can you provide us with some code? Just a wild guess from my side, did you check the watermarking? If the Watermarks aren't progressing there's no way for Flink to know when to emit a window and therefore you won't see any outgoing events. Kind

Re: flink cdc metrics question

2024-04-07 Thread Shawn Huang
Hello, flink cdc currently does not provide a metric for the number of unconsumed binlog records. You can use the currentFetchEventTimeLag metric (the lag between the timestamps in the consumed binlog data and the current time) to judge the current consumption status. [1]

Re: Combining multiple stages into a multi-stage processing pipeline

2024-04-07 Thread Mark Petronic
Thank you Yunfeng. Your comments gave me some insights to explore how to use consecutive windows. So, I coded up a version that looks like this and works well for me: KafkaSource => Keyby => TumblingWindows => ProcessWindowFn => WindowAll => ProcessWindowFn => (Here I will repeat keyed and

Re: Impact on using clean code and serializing everything

2024-04-07 Thread Biao Geng
Hi Oscar, I assume the "dependency" in your description refers to the custom fields in the ProcessFunction's implementation. You are right that as the ProcessFunction inherits `Serializable` interface so we should make all fields either serializable or transient. As for performance, I have no

Re: Unsubscribe

2024-04-07 Thread Biao Geng
Hi, If you want to unsubscribe from the user-zh mailing list, please send an email with any content to user-zh-unsubscr...@flink.apache.org . Best, Biao Geng 995626544 <995626...@qq.com.invalid> wrote on Sun, Apr 7, 2024 at 16:06: > Unsubscribe > > > > > 995626544 >
