Re: How to mock new DataSource/Sink

2022-07-06 Thread David Jost
> On 5. Jul 2022, at 01:48, Alexander Fedulov wrote:
> 
> Hi David,
> 
> I started working on FLIP-238 with exactly the concerns you've mentioned in 
> mind. It is currently in development; feel free to join the discussion [1]. 
> If you need something ASAP and are not interested in rate-limiting 
> functionality, you could drop this class [2] into your test suite (this 
> version is standalone and does not require changes in any other classes). The 
> usage is as indicated in the FLIP (minus the sourceRatePerSecond parameter) [3].
> 
> [1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
> [2] https://github.com/afedulov/flink/blob/FLINK-27919-generator-source/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV0.java
> [3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source#:~:text=%7D-,Usage%3A%C2%A0,-The%20envisioned%20usage
> 
> Best,
> Alexander Fedulov


> On Mon, Jul 4, 2022 at 12:51 PM Chesnay Schepler wrote:
> It is indeed not easy to mock sources/sinks with the new interfaces.
> 
> There is an effort to make this easier for sources in the future (FLIP-238).
> 
> For the time being I'd stick with the old APIs for mock sources/sinks.
> 


Hi Chesnay and Alexander,

thank you both for your answers and for pointing me to FLIP-238. I will 
definitely have a look at your DataGeneratorSource, Alexander; thank you. 
Then I will only need a sink to go with it. :) But knowing what the state is 
helps tremendously with our decisions.

Thank you again for your help and have a great day.

Best
  David



Re: How to mock new DataSource/Sink

2022-07-04 Thread Alexander Fedulov
Hi David,

I started working on FLIP-238 with exactly the concerns you've mentioned in
mind. It is currently in development; feel free to join the discussion [1].
If you need something ASAP and are not interested in rate-limiting
functionality, you could drop this class [2] into your test suite (this
version is standalone and does not require changes in any other classes).
The usage is as indicated in the FLIP (minus the sourceRatePerSecond
parameter) [3].

[1] https://lists.apache.org/thread/7gjxto1rmkpff4kl54j8nlg5db2rqhkt
[2] https://github.com/afedulov/flink/blob/FLINK-27919-generator-source/flink-core/src/main/java/org/apache/flink/api/connector/source/lib/DataGeneratorSourceV0.java
[3] https://cwiki.apache.org/confluence/display/FLINK/FLIP-238%3A+Introduce+FLIP-27-based+Data+Generator+Source#:~:text=%7D-,Usage%3A%C2%A0,-The%20envisioned%20usage

Best,
Alexander Fedulov


On Mon, Jul 4, 2022 at 12:51 PM Chesnay Schepler wrote:

> It is indeed not easy to mock sources/sinks with the new interfaces.
>
> There is an effort to make this easier for sources in the future (FLIP-238).
>
> For the time being I'd stick with the old APIs for mock sources/sinks.
>
> On 04/07/2022 10:23, David Jost wrote:
>
> Hi,
>
> we are currently looking at replacing our sinks and sources with the 
> respective counterparts using the 'new' data source/sink API (mainly Kafka). 
> What holds us back is that we are not sure how to test the pipeline with 
> mocked sources/sinks. Up till now, we somewhat followed the 'Testing docs'[0] 
> and created a simple SinkFunction, as well as a ParallelSourceFunction, where 
> we could get data in and out at our leisure. They could be easily plugged 
> into the pipeline for the tests. But with the new API, it seems far too 
> cumbersome to take such an approach, as creating a sink or source of your 
> own now involves a lot of overhead.
>
> I would love to know what the intended or recommended way is here. I know 
> that I can still use the old API, but that a) feels wrong, and b) requires us 
> to expose the DataStream, which is not necessary in the current setup.
>
> I appreciate any ideas or even examples on this.
>
> Thank you in advance.
>
> Best
>   David
>
>
> [0]: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/testing/#testing-flink-jobs
>


Re: How to mock new DataSource/Sink

2022-07-04 Thread Chesnay Schepler

It is indeed not easy to mock sources/sinks with the new interfaces.

There is an effort to make this easier for sources in the future 
(FLIP-238).


For the time being I'd stick with the old APIs for mock sources/sinks.
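
One detail of old-API mock sinks, as a rough sketch: the 'Testing docs' [0] linked in the original message collect results in a static list, because Flink serializes functions before distributing them, so the instance created in the test is not the one that receives the records. The snippet below only imitates that effect with plain Java serialization so that it runs without Flink on the classpath. CollectingSinkSketch, CollectingSink and serializedCopy are hypothetical names invented for this illustration, not Flink classes; in a real test the sink would presumably implement the old SinkFunction interface instead.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;

final class CollectingSinkSketch {

    /**
     * Hypothetical stand-in for a test sink. It deliberately does not
     * implement any Flink interface (assumption: a real test would use
     * the old SinkFunction API here).
     */
    static class CollectingSink implements Serializable {
        private static final long serialVersionUID = 1L;

        // Static: shared by every copy of the sink inside this JVM.
        static final List<Long> VALUES =
                Collections.synchronizedList(new ArrayList<>());

        // Instance field: belongs to exactly one copy of the sink.
        final List<Long> instanceValues = new ArrayList<>();

        void invoke(Long value) {
            VALUES.add(value);
            instanceValues.add(value);
        }
    }

    /**
     * Round-trips an object through Java serialization. This only imitates
     * a function being shipped to a task; it is not how Flink is invoked.
     */
    static <T extends Serializable> T serializedCopy(T original) {
        try {
            ByteArrayOutputStream bytes = new ByteArrayOutputStream();
            try (ObjectOutputStream out = new ObjectOutputStream(bytes)) {
                out.writeObject(original);
            }
            try (ObjectInputStream in = new ObjectInputStream(
                    new ByteArrayInputStream(bytes.toByteArray()))) {
                @SuppressWarnings("unchecked")
                T copy = (T) in.readObject();
                return copy;
            }
        } catch (IOException | ClassNotFoundException e) {
            throw new IllegalStateException("serialization round trip failed", e);
        }
    }

    public static void main(String[] args) {
        CollectingSink.VALUES.clear();

        CollectingSink declaredInTest = new CollectingSink();
        CollectingSink runningInTask = serializedCopy(declaredInTest);

        // Records arrive at the copy, not at the object the test holds.
        runningInTask.invoke(2L);
        runningInTask.invoke(4L);

        System.out.println("static list:   " + CollectingSink.VALUES);
        System.out.println("instance list: " + declaredInTest.instanceValues);
    }
}
```

The instance list on the object held by the test stays empty, while the static list sees both records. The static list only works when the sink copies run in the same JVM as the test, and it needs clearing between tests.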

On 04/07/2022 10:23, David Jost wrote:

Hi,

we are currently looking at replacing our sinks and sources with the respective 
counterparts using the 'new' data source/sink API (mainly Kafka). What holds us 
back is that we are not sure how to test the pipeline with mocked 
sources/sinks. Up till now, we somewhat followed the 'Testing docs'[0] and 
created a simple SinkFunction, as well as a ParallelSourceFunction, where we 
could get data in and out at our leisure. They could be easily plugged into the 
pipeline for the tests. But with the new API, it seems far too cumbersome to 
take such an approach, as creating a sink or source of your own now involves 
a lot of overhead.

I would love to know what the intended or recommended way is here. I know 
that I can still use the old API, but that a) feels wrong, and b) requires us 
to expose the DataStream, which is not necessary in the current setup.

I appreciate any ideas or even examples on this.

Thank you in advance.

Best
   David


[0]: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/testing/#testing-flink-jobs




How to mock new DataSource/Sink

2022-07-04 Thread David Jost
Hi,

we are currently looking at replacing our sinks and sources with the respective 
counterparts using the 'new' data source/sink API (mainly Kafka). What holds us 
back is that we are not sure how to test the pipeline with mocked 
sources/sinks. Up till now, we somewhat followed the 'Testing docs'[0] and 
created a simple SinkFunction, as well as a ParallelSourceFunction, where we 
could get data in and out at our leisure. They could be easily plugged into the 
pipeline for the tests. But with the new API, it seems far too cumbersome to 
take such an approach, as creating a sink or source of your own now involves 
a lot of overhead.

I would love to know what the intended or recommended way is here. I know 
that I can still use the old API, but that a) feels wrong, and b) requires us 
to expose the DataStream, which is not necessary in the current setup.

I appreciate any ideas or even examples on this.

Thank you in advance.

Best
  David


[0]: https://nightlies.apache.org/flink/flink-docs-release-1.15/docs/dev/datastream/testing/#testing-flink-jobs
