Hi,

One more link: the overall Sources documentation page also
contains useful information [1].

Best regards,

Martijn

[1]
https://nightlies.apache.org/flink/flink-docs-release-1.14/docs/dev/datastream/sources/

On Fri, 29 Oct 2021 at 09:21, Martijn Visser <mart...@ververica.com> wrote:

> Hi,
>
> When using the DataStream API, the new File Source already supports
> continuous streaming, but that isn't documented yet [1]. There is an e2e
> test for it, so you could look at that to understand how it works.
> Continuous streaming is not yet supported for the Table API/SQL [2].
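
To make the cost of continuous monitoring concrete, here is a minimal plain-Java sketch of a polling discovery loop: every round lists everything and keeps only the paths not seen before. It is not Flink's File Source implementation. `PollingEnumerator`, `discoverNewPaths`, and the `lister` supplier are hypothetical names, and the supplier stands in for a full directory or bucket listing.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.function.Supplier;

/** Hypothetical polling enumerator for illustration only; not part of Flink. */
class PollingEnumerator {
    // Stand-in for a full listing call (for S3 this would be ListObjects).
    private final Supplier<List<String>> lister;
    private final Set<String> alreadySeen = new HashSet<>();
    // Total number of entries returned by all listings so far.
    private long listedEntries = 0;

    PollingEnumerator(Supplier<List<String>> lister) {
        this.lister = lister;
    }

    /** Runs one discovery round and returns only the paths not seen before. */
    List<String> discoverNewPaths() {
        List<String> all = lister.get();
        listedEntries += all.size();
        List<String> fresh = new ArrayList<>();
        for (String path : all) {
            // Set.add returns false for paths handed out in an earlier round.
            if (alreadySeen.add(path)) {
                fresh.add(path);
            }
        }
        return fresh;
    }

    long listedEntries() {
        return listedEntries;
    }
}
```

Each round touches every existing entry even when only one file is new, so the listing work grows with the size of the bucket rather than with the number of new files.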
>
> Best regards,
>
> Martijn
>
> [1] https://issues.apache.org/jira/browse/FLINK-20188
> [2] https://issues.apache.org/jira/browse/FLINK-20286
>
> On Fri, 29 Oct 2021 at 07:56, Yuval Itzchakov <yuva...@gmail.com> wrote:
>
>> Hi Abhishek,
>>
>> You can use `readFileStream`, which is defined directly on
>> StreamExecutionEnvironment in the DataStream API. With that method you
>> will still pay for a ListObjects call on each iteration.
>> If you want a source that does not rely on listing, you can implement a
>> custom SQS source (there is currently no official one) and use
>> Amazon S3 Event Notifications to ship events from S3 to SQS:
>> https://docs.aws.amazon.com/AmazonS3/latest/userguide/NotificationHowTo.html
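
As a rough illustration of the notification-driven idea, the plain-Java sketch below discovers files from "object created" events instead of listings. Everything in it is an assumption made for illustration: `ObjectCreated` and `NotificationEnumerator` are hypothetical names, the in-memory `Queue` stands in for an SQS queue, and real S3 notifications arrive as JSON documents that would need parsing first. It also assumes at-least-once delivery, so the same event may show up twice.

```java
import java.util.ArrayList;
import java.util.HashSet;
import java.util.List;
import java.util.Queue;
import java.util.Set;

/** Hypothetical, already-parsed "object created" event. */
class ObjectCreated {
    final String bucket;
    final String key;

    ObjectCreated(String bucket, String key) {
        this.bucket = bucket;
        this.key = key;
    }
}

/** Hypothetical enumerator that learns about new files from events, not listings. */
class NotificationEnumerator {
    // In-memory stand-in for the SQS queue that receives the S3 notifications.
    private final Queue<ObjectCreated> queue;
    private final Set<String> assigned = new HashSet<>();

    NotificationEnumerator(Queue<ObjectCreated> queue) {
        this.queue = queue;
    }

    /** Drains pending events and returns the paths not handed out before. */
    List<String> pollNewPaths() {
        List<String> fresh = new ArrayList<>();
        ObjectCreated event;
        while ((event = queue.poll()) != null) {
            String path = "s3://" + event.bucket + "/" + event.key;
            // Under at-least-once delivery an event can be redelivered; drop duplicates.
            if (assigned.add(path)) {
                fresh.add(path);
            }
        }
        return fresh;
    }
}
```

Here the work per poll depends on the number of pending events rather than on the bucket size. In a real source the set of assigned paths would have to be part of the checkpointed state and bounded in some way, which this sketch leaves out.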
>>
>>
>>
>> On Fri, Oct 29, 2021 at 3:34 AM Abhishek SP <abhisheksp1...@gmail.com>
>> wrote:
>>
>>> Hello,
>>>
>>> I see S3 supported as a Sink through StreamingFileSink
>>> <https://ci.apache.org/projects/flink/flink-docs-master/docs/connectors/datastream/streamfile_sink/>
>>> but do not see an equivalent source, such as a StreamingFileSource.
>>>
>>> *Questions:*
>>> 1. What is the current recommendation for using S3 as a continuous
>>> source for a Flink streaming application?
>>> 2. If we have to implement a custom continuous S3 source, how would one
>>> implement the SplitEnumerator, given that the S3 ListObjects API can
>>> become expensive as the bucket grows?
>>>
>>> Thanks in advance
>>>
>>> Best,
>>> Abhishek
>>>
>>
>>
>> --
>> Best Regards,
>> Yuval Itzchakov.
>>
>
