Re: Data Stream Processing applications testing

2023-05-22 Thread Giannis Polyzos
Seems interesting, thanks for sharing

On Tue, May 23, 2023 at 2:04 AM Alexandre Strapacao Guedes Vianna <
a...@cin.ufpe.br> wrote:

> Hey everyone,
>
> I wanted to share my latest paper, "A Grey Literature Review on Data
> Stream Processing Applications Testing," in the Journal of Systems and
> Software (JSS), Elsevier.
>
> This paper provides unique industry insights, addresses the challenges
> faced in Data Stream Processing (DSP) application testing, and explores
> many test approaches and tools. It bridges the gap between academia and
> industry.
>
> For a limited time, you can access the paper for free via the following
> link: https://authors.elsevier.com/a/1h7LmbKHpCdKp
>
> I believe this paper could provide valuable knowledge and support for our
> community, and I would be delighted to hear any feedback or thoughts you
> may have after reading it.
>
> Happy reading!
>
> Best Regards,
> Alexandre Vianna
>


Re: IRSA with Flink S3a connector

2023-05-22 Thread Anuj Jain
Hello,
Please provide some pointers on this issue.

Thanks !!

Regards
Anuj

On Fri, May 19, 2023 at 1:34 PM Anuj Jain  wrote:

> Hi Community,
> Looking forward to some advice on the problem.
>
> I also found this similar Jira, but I am not sure if a fix has been done
> for the Hadoop connector - can someone confirm this?
> [FLINK-23487] IRSA doesn't work with S3 - ASF JIRA (apache.org)
> 
>
> Is there any other way to integrate Flink source/sink with AWS IAM from
> EKS ?
>
> Regards
> Anuj
>
> On Thu, May 18, 2023 at 12:41 PM Anuj Jain  wrote:
>
>> Hi,
>> I have a flink job running on EKS, reading and writing data records to S3
>> buckets.
>> I am trying to set up access credentials via AWS IAM.
>> I followed this:
>> https://docs.aws.amazon.com/eks/latest/userguide/iam-roles-for-service-accounts.html
>>
>> I have configured: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
>> as the credential provider in flink-conf.yaml for hadoop s3a connector, and
>> annotated my service account with the role.
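>>
>> For concreteness, my setup looks roughly like this (a sketch; the role
>> ARN, account id and service-account name are placeholders, and the s3a
>> key name follows the Hadoop S3A documentation):
>>
>> # flink-conf.yaml
>> fs.s3a.aws.credentials.provider: com.amazonaws.auth.WebIdentityTokenCredentialsProvider
>>
>> # Kubernetes ServiceAccount annotated with the IAM role
>> apiVersion: v1
>> kind: ServiceAccount
>> metadata:
>>   name: flink-service-account
>>   annotations:
>>     eks.amazonaws.com/role-arn: arn:aws:iam::111122223333:role/flink-s3-access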
>>
>> When running the job, I am getting an access denied error.
>> Exception:
>> Caused by:
>> com.amazonaws.services.securitytoken.model.AWSSecurityTokenServiceException:
>> Not authorized to perform sts:AssumeRoleWithWebIdentity (Service:
>> AWSSecurityTokenService; Status Code: 403; Error Code: AccessDenied;
>> Request ID: 923df33a-802e-47e2-a203-0841aca03dd8; Proxy: null)
>>
>> I have tried to access S3 buckets from AWS CLI running in a pod with the
>> same service account and that works.
>>
>> Am I using the correct credential provider for IAM integration? I am not
>> sure if Hadoop S3a supports it.
>> https://issues.apache.org/jira/browse/HADOOP-18154
>>
>> Please advise if I am doing anything wrong in setting up credentials via
>> IAM.
>>
>> Regards
>> Anuj Jain
>>
>


Re: Question about Flink exception handling

2023-05-22 Thread Sharif Khan via user
Thanks for your response.

For simplicity, I want to capture exceptions in a centralized manner and
log them for further analysis, without interrupting the job's execution or
causing it to restart.
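
To illustrate what I mean, here is a minimal PyFlink sketch (the `catching`
helper and the demo pipeline are my own illustration, not an established
API): each user function is wrapped so that exceptions are logged in one
place and the failing record is dropped instead of failing the job.

import functools
import logging

from pyflink.datastream import StreamExecutionEnvironment


def catching(fn):
    # Hypothetical helper: log any exception centrally and drop the record
    # instead of letting it fail (and restart) the job.
    @functools.wraps(fn)
    def wrapper(value):
        try:
            return fn(value)
        except Exception:
            logging.exception("Record failed in %s: %r", fn.__name__, value)
            return None
    return wrapper


@catching
def parse(s):
    return int(s)  # raises ValueError for non-numeric input


env = StreamExecutionEnvironment.get_execution_environment()
env.from_collection(["1", "2", "oops"]) \
    .map(parse) \
    .filter(lambda v: v is not None) \
    .print()
env.execute("central-exception-logging-demo")

A richer variant could send the failures to a side output instead of
dropping them.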

On Tue, May 23, 2023 at 6:31 AM Shammon FY  wrote:

> Hi Sharif,
>
> I would like to know what you want to do with the exception after
> catching it. There are different approaches for different requirements;
> for example, Flink already reports these exceptions.
>
> Best,
> Shammon FY
>
>
> On Mon, May 22, 2023 at 4:45 PM Sharif Khan via user <
> user@flink.apache.org> wrote:
>
>> Hi, community.
>> Can anyone please let me know?
>>
>> 1. What is the best practice in terms of handling exceptions in Flink
>> jobs?
>>
>> 2. Is there any way to catch exceptions globally in Flink jobs?
>> Basically, I want to catch exceptions from any operators in one place
>> (globally).
>>
>> My expectation: let's say I have a pipeline
>> source-> operator(A) -> operator(B) -> operator(C) -> sink.
>> I don't want to write a try-catch for every operator. Is it possible to
>> write one try-catch for the whole pipeline?
>>
>> I'm using the Python version of the Flink API, version 1.16.
>>
>> Thanks in advance.
>>
>>
>

-- 

Md. Sharif Khan, BSc in CSE, DIU

Software Engineer

Mobile: +880 1741976078



Re: Backpressure handling in FileSource APIs - Flink 1.16

2023-05-22 Thread Shammon FY
Hi Kamal,

If I understand correctly, you want the source to take some custom action,
such as rate limiting, when there is backpressure in the job?

Best,
Shammon FY


On Mon, May 22, 2023 at 2:12 PM Kamal Mittal  wrote:

> Hello Community,
>
> Can you please share your views on the query asked above regarding back
> pressure for the FileSource APIs for Bulk and Record stream formats?
> Planning to use these APIs for AVRO-to-Parquet conversion and vice versa.
>
> Rgds,
> Kamal
>
> On Thu, May 18, 2023 at 2:33 PM Kamal Mittal  wrote:
>
>> Hello Community,
>>
>> Do the FileSource APIs for Bulk and Record stream formats handle back
>> pressure in any way, e.g. by slowing down how fast data is sent further
>> into the pipeline or how fast it is read from the source?
>> Or do they provide any callback/handle so that an action can be taken? Can
>> you please share details if any?
>>
>> Rgds,
>> Kamal
>>
>


Re: Question about Flink exception handling

2023-05-22 Thread Shammon FY
Hi Sharif,

I would like to know what you want to do with the exception after
catching it. There are different approaches for different requirements; for
example, Flink already reports these exceptions.

Best,
Shammon FY


On Mon, May 22, 2023 at 4:45 PM Sharif Khan via user 
wrote:

> Hi, community.
> Can anyone please let me know?
>
> 1. What is the best practice in terms of handling exceptions in Flink jobs?
>
> 2. Is there any way to catch exceptions globally in Flink jobs? Basically,
> I want to catch exceptions from any operators in one place (globally).
>
> My expectation: let's say I have a pipeline
> source-> operator(A) -> operator(B) -> operator(C) -> sink.
> I don't want to write a try-catch for every operator. Is it possible to
> write one try-catch for the whole pipeline?
>
> I'm using the Python version of the Flink API, version 1.16.
>
> Thanks in advance.
>
>


Data Stream Processing applications testing

2023-05-22 Thread Alexandre Strapacao Guedes Vianna
Hey everyone,

I wanted to share my latest paper, "A Grey Literature Review on Data Stream
Processing Applications Testing," in the Journal of Systems and Software
(JSS), Elsevier.

This paper provides unique industry insights, addresses the challenges
faced in Data Stream Processing (DSP) application testing, and explores
many test approaches and tools. It bridges the gap between academia and
industry.

For a limited time, you can access the paper for free via the following
link: https://authors.elsevier.com/a/1h7LmbKHpCdKp

I believe this paper could provide valuable knowledge and support for our
community, and I would be delighted to hear any feedback or thoughts you
may have after reading it.

Happy reading!

Best Regards,
Alexandre Vianna


Re: Flink Kubernetes Operator lifecycle state count metrics question

2023-05-22 Thread Andrew Otto
Also! I do have 2 FlinkDeployments deployed with this operator, but they
are in different namespaces, and the per-namespace metric for each reports
2 FlinkDeployments, even though there is only one in each namespace
according to kubectl.

Actually... we just tried to deploy a change (enabling some checkpointing)
that caused one of our FlinkDeployments to fail. Now the STABLE_Count for
each namespace reports 1.

# curl -s : | grep
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
1.0
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="rdf_streaming_updater",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
1.0

It looks like maybe this metric is not reporting per namespace, but a
global count.
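
For the Prometheus side, the equivalent of the grep above would be a PromQL
query like this (label names as scraped above):

sum by (resourcens) (flink_k8soperator_namespace_Lifecycle_State_STABLE_Count)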



On Mon, May 22, 2023 at 2:56 PM Andrew Otto  wrote:

> Oh, FWIW, I do have operator HA enabled with 2 replicas running, but in my
> examples there, I am curl-ing the leader flink operator pod.
>
>
>
> On Mon, May 22, 2023 at 2:47 PM Andrew Otto  wrote:
>
>> Hello!
>>
>> I'm doing some grafana+prometheus dashboarding for
>> flink-kubernetes-operator. Reading the metrics docs, I see that I have
>> nice per-k8s-namespace lifecycle current-count gauge metrics in Prometheus.
>>
>> Using kubectl, I can see that I have one FlinkDeployment in my namespace:
>>
>> # kubectl -n stream-enrichment-poc get flinkdeployments
>> NAME JOB STATUS   LIFECYCLE STATE
>> flink-app-main   RUNNING  STABLE
>>
>> But, prometheus is reporting that I have 2 FlinkDeployments in the STABLE
>> state.
>>
>> # curl -s :  | grep
>> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
>> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
>> 2.0
>>
>> I'm not sure why I see 2.0 reported.
>> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count reports only
>> one FlinkDeployment.
>>
>> # curl :/metrics | grep
>> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count
>> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
>> 1.0
>>
>> Is it possible that
>> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count is being
>> reported as an incrementing counter instead of a gauge?
>>
>> Thanks
>> -Andrew Otto
>>  Wikimedia Foundation
>>
>>


Re: Flink Kubernetes Operator lifecycle state count metrics question

2023-05-22 Thread Andrew Otto
Oh, FWIW, I do have operator HA enabled with 2 replicas running, but in my
examples there, I am curl-ing the leader flink operator pod.



On Mon, May 22, 2023 at 2:47 PM Andrew Otto  wrote:

> Hello!
>
> I'm doing some grafana+prometheus dashboarding for
> flink-kubernetes-operator. Reading the metrics docs, I see that I have
> nice per-k8s-namespace lifecycle current-count gauge metrics in Prometheus.
>
> Using kubectl, I can see that I have one FlinkDeployment in my namespace:
>
> # kubectl -n stream-enrichment-poc get flinkdeployments
> NAME JOB STATUS   LIFECYCLE STATE
> flink-app-main   RUNNING  STABLE
>
> But, prometheus is reporting that I have 2 FlinkDeployments in the STABLE
> state.
>
> # curl -s :  | grep
> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
> 2.0
>
> I'm not sure why I see 2.0 reported.
> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count reports only
> one FlinkDeployment.
>
> # curl :/metrics | grep
> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count
> flink_k8soperator_namespace_JmDeploymentStatus_READY_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
> 1.0
>
> Is it possible that
> flink_k8soperator_namespace_Lifecycle_State_STABLE_Count is being
> reported as an incrementing counter instead of a gauge?
>
> Thanks
> -Andrew Otto
>  Wikimedia Foundation
>
>


Flink Kubernetes Operator lifecycle state count metrics question

2023-05-22 Thread Andrew Otto
Hello!

I'm doing some grafana+prometheus dashboarding for
flink-kubernetes-operator. Reading the metrics docs, I see that I have nice
per-k8s-namespace lifecycle current-count gauge metrics in Prometheus.

Using kubectl, I can see that I have one FlinkDeployment in my namespace:

# kubectl -n stream-enrichment-poc get flinkdeployments
NAME JOB STATUS   LIFECYCLE STATE
flink-app-main   RUNNING  STABLE

But, prometheus is reporting that I have 2 FlinkDeployments in the STABLE
state.

# curl -s :  | grep
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
flink_k8soperator_namespace_Lifecycle_State_STABLE_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
2.0

I'm not sure why I see 2.0 reported.
flink_k8soperator_namespace_JmDeploymentStatus_READY_Count reports only one
FlinkDeployment.

# curl :/metrics | grep
flink_k8soperator_namespace_JmDeploymentStatus_READY_Count
flink_k8soperator_namespace_JmDeploymentStatus_READY_Count{resourcetype="FlinkDeployment",resourcens="stream_enrichment_poc",name="flink_kubernetes_operator",host="flink_kubernetes_operator_86b888d6b6_gbrt4",namespace="flink_operator",}
1.0

Is it possible that flink_k8soperator_namespace_Lifecycle_State_STABLE_Count
is being reported as an incrementing counter instead of a gauge?

Thanks
-Andrew Otto
 Wikimedia Foundation


Re: Maven plugin to detect issues early on

2023-05-22 Thread Jing Ge via user
cc user ML to get more attention, since the plugin will be used by Flink
application developers.

Best regards,
Jing

On Mon, May 22, 2023 at 3:32 PM Jing Ge  wrote:

> Hi Emre,
>
> Thanks for clarifying it. Afaiac, it is quite an interesting proposal,
> especially for Flink job developers who are heavily using the Datastream
> API. Publishing the plugin in your Github would be a feasible way for the
> first move. As I mentioned previously, in order to help the community
> understand the plugin, you might want to describe all those attractive
> features you mentioned in e.g. the readme.md with more technical details. I
> personally was wondering how those connector compatibility rules will be
> defined and maintained, given that almost all connectors have been
> externalized.
>
> Best regards,
> Jing
>
> On Mon, May 22, 2023 at 11:24 AM Kartoglu, Emre
>  wrote:
>
>> Hi Jing,
>>
>> The proposed plugin would be used by Flink application developers, when
>> they are writing their Flink job. It would trigger during
>> compilation/packaging and would look for known incompatibilities, bad
>> practices, or bugs.
>> For instance one cause of frustration for our customers is connector
>> incompatibilities (specifically Kafka and Kinesis) with certain Flink
>> versions. This plugin would be a quick way to update a list of known
>> incompatibilities, bugs, bad practices, so customers get errors during
>> compilation/packaging and not after they've deployed their Flink job.
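>>
>> To make that concrete, an entry in such a list of known issues could look
>> roughly like this (a purely hypothetical format, just for illustration,
>> not the plugin's actual rule syntax):
>>
>> rule:
>>   artifact: org.apache.flink:flink-connector-kinesis
>>   flinkVersions: "[1.15.0,1.16.0)"
>>   severity: error
>>   message: known incompatibility with these Flink versions; upgrade the connector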
>>
>> From what you're saying, the FLIP route might not be the best way to go.
>> We might publish this plugin in our own GitHub namespace/group first, and
>> then get community acknowledgement/support for it. I believe working with
>> the Flink community on this is key as we'd need their support/opinion to do
>> this the right way and reach more Flink users.
>>
>> Thanks
>> Emre
>>
>> On 21/05/2023, 16:48, "Jing Ge" <j...@ververica.com.invalid> wrote:
>>
>> Hi Emre,
>>
>>
>> Thanks for your proposal. It looks very interesting! Please pay attention
>> that most connectors have been externalized. Will your proposed plug be
>> used for building Flink Connectors or Flink itself? Furthermore, it would
>> be great if you could elaborate features wrt best practices so that we
>> could understand how the plugin will help us.
>>
>>
>> Afaik, FLIP is recommended for improvement ideas that will change public
>> APIs. I am not sure if a new maven plugin belongs to it.
>>
>>
>> Best regards,
>> Jing
>>
>>
>> On Tue, May 16, 2023 at 11:29 AM Kartoglu, Emre wrote:
>>
>>
>> > Hello all,
>> >
>> > Myself and 2 colleagues developed a Maven plugin (no support for Gradle
>> or
>> > other build tools yet) that we use internally to detect potential
>> issues in
>> > Flink apps at compilation/packaging stage:
>> >
>> >
>> > * Known connector version incompatibilities – so far covering Kafka
>> > and Kinesis
>> > * Best practices e.g. setting operator IDs
>> >
>> > We’d like to make this open-source. Ideally with the Flink community’s
>> > support/mention of it on the Flink website, so more people use it.
>> >
>> > Going forward, I believe we have at least the following options:
>> >
>> > * Get community support: Create a FLIP to discuss where the plugin
>> > should live, what kind of problems it should detect etc.
>> > * We still open-source it but without the community support (if the
>> > community has objections to officially supporting it for instance).
>> >
>> > Just wanted to gauge the feeling/thoughts towards this tool from the
>> > community before going ahead.
>> >
>> > Thanks,
>> > Emre
>> >
>> >
>>
>>
>>
>>


Question about Flink exception handling

2023-05-22 Thread Sharif Khan via user
Hi, community.
Can anyone please let me know?

1. What is the best practice in terms of handling exceptions in Flink jobs?

2. Is there any way to catch exceptions globally in Flink jobs? Basically,
I want to catch exceptions from any operators in one place (globally).

My expectation: let's say I have a pipeline
source-> operator(A) -> operator(B) -> operator(C) -> sink.
I don't want to write a try-catch for every operator. Is it possible to
write one try-catch for the whole pipeline?

I'm using the Python version of the Flink API, version 1.16.

Thanks in advance.



Re: Question

2023-05-22 Thread Leonard Xu
(1) Check whether some other job or synchronization tool is using the same server-id.
(2) You can try generating the server-id from the machine IP plus a timestamp, which makes collisions as unlikely as possible.
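
For (2), a rough sketch (PyFlink; everything except the 'server-id' option
is a placeholder):

import socket
import struct
import time

# Derive a server-id range from host IP + timestamp so concurrent jobs are
# unlikely to collide; the range width must cover the source parallelism.
ip = struct.unpack("!I", socket.inet_aton(socket.gethostbyname(socket.gethostname())))[0]
parallelism = 4
base = 6500 + (ip + int(time.time())) % 1000000
server_id = f"{base}-{base + parallelism - 1}"

ddl = f"""CREATE TABLE orders (id BIGINT, PRIMARY KEY (id) NOT ENFORCED) WITH (
  'connector' = 'mysql-cdc',
  'hostname' = 'mysql.example.com',
  'username' = 'flink',
  'password' = '***',
  'database-name' = 'mydb',
  'table-name' = 'orders',
  'server-id' = '{server_id}'
)"""
# t_env.execute_sql(ddl)  # with flink-sql-connector-mysql-cdc on the classpath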

Best regards,
Leonard (雪尽)

> On May 22, 2023, at 3:34 PM, 曹明勤  wrote:
> 
> In the flink-cdc-mysql job I submitted, Flink needs to synchronize data
> from multiple tables, but I ran into the problem of duplicate server-ids.
> I tried setting a random number, but server-id has a limited value range
> and random numbers can still repeat. The official documentation advised me
> to set server-id to a range, such as 5400-6400, and to set Flink's
> parallelism. I did all of this, but when I synchronize a larger number of
> tables I still get duplicate server-ids, which causes the job submission
> to fail. What do I need to configure to avoid this error?



Question

2023-05-22 Thread 曹明勤
In the flink-cdc-mysql job I submitted, Flink needs to synchronize data from
multiple tables, but I ran into the problem of duplicate server-ids. I tried
setting a random number, but server-id has a limited value range and random
numbers can still repeat. The official documentation advised me to set
server-id to a range, such as 5400-6400, and to set Flink's parallelism. I
did all of this, but when I synchronize a larger number of tables I still
get duplicate server-ids, which causes the job submission to fail. What do I
need to configure to avoid this error?

Re: Backpressure handling in FileSource APIs - Flink 1.16

2023-05-22 Thread Kamal Mittal
Hello Community,

Can you please share your views on the query asked above regarding back
pressure for the FileSource APIs for Bulk and Record stream formats?
Planning to use these APIs for AVRO-to-Parquet conversion and vice versa
(sketch below).
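
For context, the kind of pipeline I have in mind looks roughly like this (a
PyFlink sketch; the path is a placeholder and the text-line format stands in
for the AVRO/Parquet bulk formats, which plug into the same builder):

from pyflink.common import WatermarkStrategy
from pyflink.datastream import StreamExecutionEnvironment
from pyflink.datastream.connectors.file_system import FileSource, StreamFormat

env = StreamExecutionEnvironment.get_execution_environment()
source = FileSource.for_record_stream_format(
    StreamFormat.text_line_format(), "/tmp/input").build()
# A slow downstream operator/sink is where I would expect reading to slow down.
env.from_source(source, WatermarkStrategy.no_watermarks(), "file-source") \
    .print()
env.execute("filesource-backpressure-question")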

Rgds,
Kamal

On Thu, May 18, 2023 at 2:33 PM Kamal Mittal  wrote:

> Hello Community,
>
> Do the FileSource APIs for Bulk and Record stream formats handle back
> pressure in any way, e.g. by slowing down how fast data is sent further
> into the pipeline or how fast it is read from the source?
> Or do they provide any callback/handle so that an action can be taken? Can
> you please share details if any?
>
> Rgds,
> Kamal
>