Hi Jing,

> I would suggest adding that information into the FLIP.

Updated now. Please review the new version of the FLIP whenever you have time.

> +1 Looking forward to your PR :-)

I will request your review once I'm ready with the PR :-)

Bests,
Samrat

On Tue, Jun 6, 2023 at 11:43 PM Samrat Deb <decordea...@gmail.com> wrote:

> Hi Martijn,
>
> > If I understand this correctly, the Redshift sink
> > would not be able to support exactly-once, is that correct?
>
> As I delve deeper into the study of Redshift's capabilities, I have
> discovered that it does support "merge into" operations [1], with some
> merge examples in [2].
> This opens up the possibility of implementing exactly-once semantics with
> the connector.
> However, I believe it would be prudent to start with a more focused scope
> for the initial phase of implementation and defer the exactly-once support
> to subsequent iterations.
>
> Before finalizing the approach, I would greatly appreciate your thoughts
> and suggestions on this matter.
> Should we prioritize the initial implementation without exactly-once
> support, or would you advise incorporating it right from the start?
> Your insights and experiences would be immensely valuable in making this
> decision.
>
> [1]
> https://docs.aws.amazon.com/redshift/latest/dg/t_updating-inserting-using-staging-tables-.html
> [2] https://docs.aws.amazon.com/redshift/latest/dg/merge-examples.html
>
> Bests,
> Samrat
>
> On Mon, Jun 5, 2023 at 7:09 PM Jing Ge <j...@ververica.com.invalid> wrote:
>
>> Hi Samrat,
>>
>> Thanks for the feedback. I would suggest adding that information into the
>> FLIP.
>>
>> +1 Looking forward to your PR :-)
>>
>> Best regards,
>> Jing
>>
>> On Sat, Jun 3, 2023 at 9:19 PM Samrat Deb <decordea...@gmail.com> wrote:
>>
>> > Hi Jing Ge,
>> >
>> > >>> Do you already have any prototype? I'd like to join the reviews.
>> > The prototype is in progress. I will raise a dedicated PR for review soon
>> > and also notify this thread.
>> >
>> > >>> Will the Redshift connector provide additional features
>> > beyond the mediator/wrapper of the jdbc connector?
>> >
>> > Here are the additional features that the Flink connector for AWS Redshift
>> > can provide on top of using JDBC:
>> >
>> > 1. Integration with AWS Redshift Workload Management (WLM): AWS Redshift
>> > allows you to configure WLM[1] to manage query prioritization and resource
>> > allocation. The Flink connector for Redshift will be agnostic to the
>> > configured WLM and utilize it for scaling in and out for the sink. This
>> > means that the connector can leverage the WLM capabilities of Redshift to
>> > optimize the execution of queries and allocate resources efficiently based
>> > on your defined workload priorities.
>> >
>> > 2. Abstraction of AWS Redshift Quotas and Limits: AWS Redshift imposes
>> > certain quotas and limits[2] on various aspects such as the number of
>> > clusters, concurrent connections, queries per second, etc. The Flink
>> > connector for Redshift will provide an abstraction layer for users,
>> > allowing them to work with Redshift without having to worry about these
>> > specific limits. The connector will handle the management of connections
>> > and queries within the defined quotas and limits, abstracting away the
>> > complexity and ensuring compliance with Redshift's restrictions.
>> >
>> > These features aim to simplify the integration of Flink with AWS Redshift,
>> > providing optimized resource utilization and transparent handling of
>> > Redshift-specific limitations.
>> >
>> > Bests,
>> > Samrat
>> >
>> > [1]
>> > https://docs.aws.amazon.com/redshift/latest/dg/cm-c-implementing-workload-management.html
>> > [2]
>> > https://docs.aws.amazon.com/redshift/latest/mgmt/amazon-redshift-limits.html
>> >
>> > On Sat, Jun 3, 2023 at 11:40 PM Samrat Deb <decordea...@gmail.com> wrote:
>> >
>> > > Hi Ahmed,
>> > >
>> > > >>> please let me know If you need any collaboration regarding integration
>> > > with AWS connectors credential providers or regarding FLIP-171 I would be
>> > > more than happy to assist.
>> > >
>> > > Sure, I will reach out in case any help is required.
>> > >
>> > > On Fri, Jun 2, 2023 at 6:12 PM Jing Ge <j...@ververica.com.invalid> wrote:
>> > >
>> > >> Hi Samrat,
>> > >>
>> > >> Excited to see your proposal. Supporting data warehouses is one of the
>> > >> major tracks for Flink. Thanks for driving it! Happy to see that we
>> > >> reached consensus to prioritize the Sink over Source in the previous
>> > >> discussion. Do you already have any prototype? I'd like to join the
>> > >> reviews.
>> > >>
>> > >> Just out of curiosity, speaking of JDBC mode, according to the FLIP, it
>> > >> should be doable to directly use the jdbc connector with Redshift, if I am
>> > >> not mistaken. Will the Redshift connector provide additional features
>> > >> beyond the mediator/wrapper of the jdbc connector?
>> > >>
>> > >> Best regards,
>> > >> Jing
>> > >>
>> > >> On Thu, Jun 1, 2023 at 8:22 PM Ahmed Hamdy <hamdy10...@gmail.com> wrote:
>> > >>
>> > >> > Hi Samrat
>> > >> >
>> > >> > Thanks for putting up this FLIP. I agree regarding the importance of the
>> > >> > use case.
>> > >> > Please let me know if you need any collaboration regarding integration
>> > >> > with AWS connectors credential providers or regarding FLIP-171; I would
>> > >> > be more than happy to assist.
>> > >> > I also like Leonard's proposal for starting with DataStreamSink and
>> > >> > TableSink. It would be great to have some milestones delivered as soon as
>> > >> > they are ready.
>> > >> > Best regards,
>> > >> > Ahmed Hamdy
>> > >> >
>> > >> > On Wed, 31 May 2023 at 11:15, Samrat Deb <decordea...@gmail.com> wrote:
>> > >> >
>> > >> > > Hi Liu Ron,
>> > >> > >
>> > >> > > > 1. Regarding the `read.mode` and `write.mode`, you say here provides two
>> > >> > > > modes, respectively, jdbc and `unload or copy`, What is the default value
>> > >> > > > for `read.mode` and `write.mode`?
>> > >> > >
>> > >> > > I have made an effort to make the configuration options `read.mode` and
>> > >> > > `write.mode` mandatory for the "flink-connector-redshift" according to
>> > >> > > FLIP[1]. The rationale behind this decision is to empower users who are
>> > >> > > familiar with their Redshift setup and have specific expectations for the
>> > >> > > sink. By making these configurations mandatory, users can have more
>> > >> > > control and flexibility in configuring the connector to meet their
>> > >> > > requirements.
>> > >> > >
>> > >> > > However, I am open to receiving feedback on whether it would be
>> > >> > > beneficial to make the configuration options non-mandatory and set default
>> > >> > > values for them. If you believe there are advantages to having default
>> > >> > > values or have any other suggestions, please share your thoughts. Your
>> > >> > > feedback is highly appreciated.
>> > >> > >
>> > >> > > > 2. For Source, does it support both batch read and streaming read?
>> > >> > >
>> > >> > > Redshift currently does not provide native support for streaming reads,
>> > >> > > although it does support streaming writes[2]. As part of the plan, I
>> > >> > > intend to conduct a proof of concept and benchmarking to explore the
>> > >> > > possibilities of implementing streaming reads using the Flink JDBC
>> > >> > > connector, as Redshift is JDBC compatible.
>> > >> > > However, it is important to note that, in the initial phase of
>> > >> > > implementation, the focus will primarily be on supporting batch reads
>> > >> > > rather than streaming reads. This approach will allow us to deliver a
>> > >> > > robust and reliable solution for batch processing in phase 2 of the
>> > >> > > implementation.
>> > >> > >
>> > >> > > [1]
>> > >> > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>> > >> > > [2]
>> > >> > > https://docs.aws.amazon.com/redshift/latest/dg/materialized-view-streaming-ingestion.html
>> > >> > >
>> > >> > > Bests,
>> > >> > > Samrat
>> > >> > >
>> > >> > > On Wed, May 31, 2023 at 8:03 AM liu ron <ron9....@gmail.com> wrote:
>> > >> > >
>> > >> > > > Hi, Samrat
>> > >> > > >
>> > >> > > > Thanks for driving this FLIP. It looks like supporting
>> > >> > > > flink-connector-redshift is very useful to Flink. I have two questions:
>> > >> > > > 1. Regarding the `read.mode` and `write.mode`, you say here provides two
>> > >> > > > modes, respectively, jdbc and `unload or copy`, What is the default value
>> > >> > > > for `read.mode` and `write.mode`?
>> > >> > > > 2. For Source, does it support both batch read and streaming read?
>> > >> > > >
>> > >> > > > Best,
>> > >> > > > Ron
>> > >> > > >
>> > >> > > > On Tue, May 30, 2023 at 17:15, Samrat Deb <decordea...@gmail.com> wrote:
>> > >> > > >
>> > >> > > > > [1]
>> > >> > > > > https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>> > >> > > > >
>> > >> > > > > [note] Missed the trailing link for previous mail
>> > >> > > > >
>> > >> > > > > On Tue, May 30, 2023 at 2:43 PM Samrat Deb <decordea...@gmail.com> wrote:
>> > >> > > > >
>> > >> > > > > > Hi Leonard,
>> > >> > > > > >
>> > >> > > > > > > and I’m glad to help review the design as well as the code review.
>> > >> > > > > > Thank you so much. It would be really great and helpful to bring
>> > >> > > > > > flink-connector-redshift to Flink users :) .
>> > >> > > > > >
>> > >> > > > > > I have divided the implementation into 3 phases in the `Scope`
>> > >> > > > > > section[1]. The 1st phase is to
>> > >> > > > > >
>> > >> > > > > > - Integrate with Flink Sink API (*FLIP-171*
>> > >> > > > > > <https://cwiki.apache.org/confluence/display/FLINK/FLIP-171%3A+Async+Sink>)
>> > >> > > > > >
>> > >> > > > > > > About the implementation phases, How about prioritizing support for the
>> > >> > > > > > > Datastream Sink API and TableSink API in the first phase?
>> > >> > > > > > I completely agree with prioritizing support for the Datastream
>> > >> > > > > > Sink API and TableSink API in the first phase.
>> > >> > > > > > I will update the FLIP[1] as you have suggested.
>> > >> > > > > >
>> > >> > > > > > > It seems that the primary use cases for the Redshift connector are
>> > >> > > > > > > acting as a sink for processed data by Flink.
>> > >> > > > > > Yes, the majority of the requests and requirements for the Redshift
>> > >> > > > > > connector are for a sink for data processed by Flink.
>> > >> > > > > >
>> > >> > > > > > Bests,
>> > >> > > > > > Samrat
>> > >> > > > > >
>> > >> > > > > > On Tue, May 30, 2023 at 12:35 PM Leonard Xu <xbjt...@gmail.com> wrote:
>> > >> > > > > >
>> > >> > > > > >> Thanks @Samrat for bringing this discussion.
>> > >> > > > > >>
>> > >> > > > > >> It makes sense to me to introduce AWS Redshift connector for Apache
>> > >> > > > > >> Flink, and I’m glad to help review the design as well as the code
>> > >> > > > > >> review.
>> > >> > > > > >>
>> > >> > > > > >> About the implementation phases, How about prioritizing support for the
>> > >> > > > > >> Datastream Sink API and TableSink API in the first phase? It seems that
>> > >> > > > > >> the primary use cases for the Redshift connector are acting as a sink
>> > >> > > > > >> for processed data by Flink.
>> > >> > > > > >>
>> > >> > > > > >> Best,
>> > >> > > > > >> Leonard
>> > >> > > > > >>
>> > >> > > > > >> > On May 29, 2023, at 12:51 PM, Samrat Deb <decordea...@gmail.com> wrote:
>> > >> > > > > >> >
>> > >> > > > > >> > Hello all,
>> > >> > > > > >> >
>> > >> > > > > >> > Context:
>> > >> > > > > >> > Amazon Redshift [1] is a fully managed, petabyte-scale data warehouse
>> > >> > > > > >> > service in the cloud. It allows analyzing data without all of the
>> > >> > > > > >> > configurations of a provisioned data warehouse. Resources are
>> > >> > > > > >> > automatically provisioned and data warehouse capacity is intelligently
>> > >> > > > > >> > scaled to deliver fast performance for even the most demanding and
>> > >> > > > > >> > unpredictable workloads.
>> > >> > > > > >> > Redshift is one of the most widely used warehouse solutions in the
>> > >> > > > > >> > current market.
>> > >> > > > > >> >
>> > >> > > > > >> > Building a Flink connector for Redshift will allow Flink users to have
>> > >> > > > > >> > a source and sink directly to Redshift. It will help Flink expand its
>> > >> > > > > >> > scope to Redshift as a new connector in the ecosystem.
>> > >> > > > > >> >
>> > >> > > > > >> > I would like to start a discussion on the FLIP-307: Flink connector
>> > >> > > > > >> > redshift [2].
>> > >> > > > > >> > Looking forward to comments, feedback and suggestions from the
>> > >> > > > > >> > community on the proposal.
>> > >> > > > > >> >
>> > >> > > > > >> > [1] https://docs.aws.amazon.com/redshift/latest/mgmt/welcome.html
>> > >> > > > > >> > [2]
>> > >> > > > > >> > https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>> > >> > > > > >> >
>> > >> > > > > >> > Bests,
>> > >> > > > > >> > Samrat