Hi, I am trying to collect files from HDFS in my DataStream job. I need to
collect two types of files: CSV and Parquet.
I understand that Flink supports both formats, but in streaming mode Flink
doesn't support splitting these formats; splitting is only supported in the
Table API.
I wanted to
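A minimal sketch (not from the original mail) of reading Parquet files from
HDFS with the DataStream FileSource, assuming flink-parquet and flink-avro
are on the classpath; the schema and path are placeholders. CSV files can be
read the same way by swapping in a CSV/text StreamFormat.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.api.common.eventtime.WatermarkStrategy;
import org.apache.flink.connector.file.src.FileSource;
import org.apache.flink.core.fs.Path;
import org.apache.flink.formats.parquet.avro.AvroParquetReaders;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class ReadParquetFromHdfs {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // Replace with the real Avro schema of your Parquet files (placeholder).
        Schema schema = new Schema.Parser().parse(
                "{\"type\":\"record\",\"name\":\"Example\",\"fields\":"
                        + "[{\"name\":\"id\",\"type\":\"long\"}]}");

        // Each Parquet file is read as one unit; the HDFS path is a placeholder.
        FileSource<GenericRecord> source = FileSource
                .forRecordStreamFormat(
                        AvroParquetReaders.forGenericRecord(schema),
                        new Path("hdfs:///data/input"))
                .build();

        env.fromSource(source, WatermarkStrategy.noWatermarks(), "parquet-files")
                .print();
        env.execute("read-parquet-from-hdfs");
    }
}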
1. It is fine as long as it stays within a controllable range.
2. For analysis you can split the chain apart; for the actual production run it depends on the situation, whichever gives the better performance (see the sketch below).
3. Check the monitoring: the Flink web UI shows each node's backpressure status in different colors.
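For point 2, a minimal sketch (not from the original mail) of isolating only
the suspected operator with disableChaining(), so the web UI colors can be
read per operator; the operator and names here are placeholders.

import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class BackpressureChainSplit {
    public static void main(String[] args) throws Exception {
        StreamExecutionEnvironment env = StreamExecutionEnvironment.getExecutionEnvironment();

        // env.disableOperatorChaining() would split every chain for analysis,
        // at some runtime cost; here only the suspected operator is isolated.
        env.fromElements("a", "bb", "ccc")
                .map(String::toUpperCase).name("suspect-operator")
                .disableChaining()  // its backpressure/busy colors now show up separately in the web UI
                .print();

        env.execute("backpressure-analysis-sketch");
    }
}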
星海 <2278179...@qq.com.invalid> wrote on Wed, 16 Aug 2023 at 22:03:
>
> Hello everyone, I have a few questions:
> 1. Is it reasonable for backpressure to exist in Flink? Is it fine as long as it stays within a controllable range, or should there be as little of it as possible?
> 2. If backpressure occurs and multiple operators are chained together, which makes analysis hard, do I need to split the chain apart first in order to analyze it?
>
Hello Community,
Please share your views on the below.
Rgds,
Kamal
From: Kamal Mittal via user
Sent: 16 August 2023 04:35 PM
To: user@flink.apache.org
Subject: Flink AVRO to Parquet writer - Row group size/Page size
Hello,
For Parquet, the default row group size is 128 MB and the page size is 1 MB, but Flink
Hi Krzysztof,
You may want to check flink-kubernetes-operator-api (
https://mvnrepository.com/artifact/org.apache.flink/flink-kubernetes-operator-api),
here's an example for reading FlinkDeployments
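A rough sketch of what reading FlinkDeployments could look like with the
fabric8 Kubernetes client and the CR classes from that artifact; the
FlinkDeployment class location and the namespace are assumptions, so please
check them against your operator version.

import io.fabric8.kubernetes.client.KubernetesClient;
import io.fabric8.kubernetes.client.KubernetesClientBuilder;
import org.apache.flink.kubernetes.operator.api.FlinkDeployment;

public class ListFlinkDeployments {
    public static void main(String[] args) {
        // Assumes flink-kubernetes-operator-api and the fabric8 client are on
        // the classpath; "default" is a placeholder namespace.
        try (KubernetesClient client = new KubernetesClientBuilder().build()) {
            client.resources(FlinkDeployment.class)
                    .inNamespace("default")
                    .list()
                    .getItems()
                    .forEach(d -> System.out.println(d.getMetadata().getName()));
        }
    }
}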
Hi,
I have a use case where I would like to run Flink jobs using the Apache
Flink Kubernetes operator, where actions like job submission (new and from a
savepoint), job cancellation with a savepoint, and cluster creation will be
triggered from a Java-based microservice.
Is there any recommended/dedicated Java API for
Hello,
For Parquet, the default row group size is 128 MB and the page size is 1 MB,
but the Flink bulk writer using the file sink creates files based on the
checkpointing interval only.
So is there any significance to the configured row group size and page size
for the Flink Parquet bulk writer? How does Flink use these
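A minimal sketch (not part of the original mail) of how the sizes could be
passed down to the underlying Parquet writer, using flink-parquet's
ParquetWriterFactory/ParquetBuilder and Parquet's AvroParquetWriter builder;
the 64 MB / 512 KB values are examples only.

import org.apache.avro.Schema;
import org.apache.avro.generic.GenericRecord;
import org.apache.flink.formats.parquet.ParquetBuilder;
import org.apache.flink.formats.parquet.ParquetWriterFactory;
import org.apache.parquet.avro.AvroParquetWriter;

public class TunedParquetWriter {
    // Writer factory for the file sink whose underlying Parquet writer uses
    // explicit row group and page sizes (example values).
    public static ParquetWriterFactory<GenericRecord> forSchema(Schema schema) {
        ParquetBuilder<GenericRecord> builder = out ->
                AvroParquetWriter.<GenericRecord>builder(out)
                        .withSchema(schema)
                        .withRowGroupSize(64 * 1024 * 1024)
                        .withPageSize(512 * 1024)
                        .build();
        return new ParquetWriterFactory<>(builder);
    }
}

As far as I understand, in streaming mode the file sink still rolls a new
part file on every checkpoint, so a row group can never grow beyond the data
written within one checkpoint interval.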
Hi Dennis,
As Ron said, we can judge this situation by the metrics.
We usually report the metrics to an external system like Prometheus via a
metric reporter [1], and these metrics can then be visualized with other
tools like Grafana [2].
Best,
Hang
[1]
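A minimal sketch (as an illustration, not from Hang's mail) of enabling the
Prometheus reporter; in a real deployment these keys normally go into
flink-conf.yaml, and the port is an example value.

import org.apache.flink.configuration.Configuration;
import org.apache.flink.streaming.api.environment.StreamExecutionEnvironment;

public class PrometheusReporterSketch {
    public static void main(String[] args) throws Exception {
        // Requires flink-metrics-prometheus on the classpath; keys follow the
        // Flink metric reporter configuration.
        Configuration conf = new Configuration();
        conf.setString("metrics.reporter.prom.factory.class",
                "org.apache.flink.metrics.prometheus.PrometheusReporterFactory");
        conf.setString("metrics.reporter.prom.port", "9249");

        StreamExecutionEnvironment env =
                StreamExecutionEnvironment.createLocalEnvironmentWithWebUI(conf);
        env.fromElements(1, 2, 3).print();
        env.execute("prometheus-reporter-sketch");
    }
}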
Hi Shammon,
Thanks for your response.
If it is a network issue as you have mentioned, how does it read the
contents of the jar file? We can see that the code is read and it throws an
error only when executing the SQL. Also, can you let us know exactly what
address could be wrong here, so that we
Hi Dennis,
Although all the operators are chained together, each operator's metrics are
still there; you can view the metrics for the corresponding operator's input
and output records through the UI, as follows:
[image: image.png]
Best,
Ron
Dennis Jung wrote on Wed, 16 Aug 2023 at 14:13:
> Hello people,
>
Hello people,
I'm trying to monitor data skewness between TaskManagers with the web UI.
Currently all operators have been chained, so I cannot see how data has been
skewed across TaskManagers (or subtasks). But if I disable chaining, AFAIK,
it can degrade performance.
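A minimal sketch (not from the original mail) of a pass-through operator that
exposes a per-subtask counter, so the record distribution across subtasks can
be compared without breaking the chain; the class and metric names are
placeholders.

import org.apache.flink.api.common.functions.RichMapFunction;
import org.apache.flink.configuration.Configuration;
import org.apache.flink.metrics.Counter;

public class CountingPassThrough<T> extends RichMapFunction<T, T> {
    private transient Counter recordsSeen;

    @Override
    public void open(Configuration parameters) {
        // One counter instance per parallel subtask, so the record distribution
        // across subtasks is visible in the web UI or an external metrics system.
        recordsSeen = getRuntimeContext().getMetricGroup().counter("recordsSeen");
    }

    @Override
    public T map(T value) {
        recordsSeen.inc();
        return value;
    }
}

Inserting it with stream.map(new CountingPassThrough<>()) keeps the chain
intact while the counter is still reported per subtask.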
Hello,
Thank you. I think it is a problem caused by Kafka configuration, not
Flink. I'll take a look and let you know if there's an issue in Flink.
BR,
Jung
On Tue, 15 Aug 2023 at 9:40 PM, Hector Rios wrote:
> Hi there
>
> It would be helpful if you could include the code for your pipeline. One
>