Thanks Samrat, this makes sense to me and addresses connector coupling concerns.
On Sat, Sep 16, 2023 at 4:30 PM Samrat Deb <decordea...@gmail.com> wrote:

> Hi,
>
> I've made updates to the FLIP [1] by incorporating relevant changes to
> avoid using the Flink connector JDBC. This decision was based on the
> following reasons:
>
> - AWS Redshift utilizes its own specialized JDBC driver [2]. Given that
>   this driver may evolve over time, using the Flink connector JDBC might
>   face compatibility issues.
> - The dependency issues mentioned earlier in this thread.
> - A Proof of Concept (POC) has been implemented for a DynamicSink for
>   Redshift [3]. This shows that utilizing flink-connector-jdbc for
>   Redshift doesn't provide significant benefits.
>
> The JDBC mode offers more flexibility by allowing direct use of the JDBC
> driver, enabling the Flink Redshift connector to evolve independently.
>
> [1] https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
> [2] https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html
> [3] https://github.com/Samrat002/flink-connector-aws/tree/redshift-connector
>
> Bests,
> Samrat
>
> On Sat, Sep 16, 2023 at 10:46 AM Samrat Deb <decordea...@gmail.com> wrote:
>
>> Hello Martijn,
>>
>> I apologize for the delay in responding.
>>
>> Regarding your question about integrating Redshift directly into the
>> JDBC connector: we are planning to offer two modes, JDBC and UNLOAD.
>> Through our internal benchmarking, we have observed good performance in
>> the UNLOAD flow. Additionally, both flows are needed for different user
>> use cases.
>>
>> If we were to explicitly add the JDBC mode to flink-connector-jdbc, we
>> would have two options:
>>
>> 1. Include flink-connector-redshift in flink-connector-jdbc: this would
>>    involve incorporating the Redshift connector into the JDBC connector.
>>    Since Redshift is an AWS proprietary product, some authentication
>>    utilities can be reused from flink-connector-aws-base, and any
>>    additional utilities the Redshift connector requires could be added
>>    there. In my opinion, this approach is favorable as it keeps
>>    everything related to AWS in flink-connector-aws.
>>
>> 2. Implement JDBC mode for the Redshift sink in flink-connector-jdbc and
>>    UNLOAD in flink-connector-aws: this alternative is not advisable, as
>>    it could lead to maintenance challenges and complexities.
>>
>> Furthermore, it's important to highlight that Redshift has its own
>> customized JDBC driver [1], specifically optimized for compatibility
>> with Redshift. While I cannot confirm this definitively, there is a
>> possibility that the Redshift JDBC driver [1] differs in compatibility
>> from the JDBC driver used in flink-connector-jdbc. This suggests that if
>> flink-connector-redshift were to rely on the JDBC connector, it could
>> face compatibility issues in the future.
>>
>> Given these considerations, it seems prudent to keep the
>> Redshift-related functionality within flink-connector-aws and keep the
>> Redshift connector independent of the JDBC connector. This approach
>> helps ensure that the Redshift connector remains flexible and adaptable
>> to any potential changes in JDBC compatibility.
>>
>> I will update the FLIP [2] to remove dependencies on flink-connector-jdbc.
>>
>> [1] https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html
>> [2] https://cwiki.apache.org/confluence/display/FLINK/FLIP-307%3A++Flink+Connector+Redshift
>>
>> Bests,
>> Samrat
>>
>> On Mon, Sep 11, 2023 at 1:21 PM Martijn Visser <martijnvis...@apache.org> wrote:
>>
>>> Hi Samrat,
>>>
>>> I'm still having doubts about the dependency on the JDBC connector.
>>> When a user specifies 'read mode', it will use the JDBC connector under
>>> the hood. Why not integrate Redshift directly in the JDBC connector
>>> itself? It removes the need for a dependency on the JDBC driver,
>>> especially keeping in mind that this driver uses the old
>>> SourceFunction/SinkFunction interfaces because it hasn't been migrated
>>> yet.
>>>
>>> Best regards,
>>>
>>> Martijn
>>>
>>> On Mon, Sep 11, 2023 at 8:54 AM Samrat Deb <decordea...@gmail.com> wrote:
>>>
>>>> Hi Leonard,
>>>>
>>>>> Do we have to rely on the latest version of JDBC Connector here?
>>>>
>>>> No, there's no need for us to depend on the latest version of the JDBC
>>>> Connector. Redshift has its dedicated JDBC driver [1], which includes
>>>> custom modifications tailored to Redshift's specific implementation
>>>> needs. This driver is the most suitable choice for our purposes.
>>>>
>>>>> Could you collect the APIs that Redshift generally needs to use?
>>>>
>>>> I am actively working on it and making progress towards the POC.
>>>>
>>>> Bests,
>>>> Samrat
>>>>
>>>> [1] https://docs.aws.amazon.com/redshift/latest/mgmt/jdbc20-download-driver.html
>>>>
>>>> On Mon, Sep 11, 2023 at 12:02 PM Samrat Deb <decordea...@gmail.com> wrote:
>>>>
>>>>> Hello Danny,
>>>>>
>>>>> Thank you for your valuable feedback and insightful suggestions.
>>>>>
>>>>> I will revise the FLIP to incorporate all of your queries and review
>>>>> suggestions. Additionally, I plan to provide a Proof of Concept (POC)
>>>>> for the connector by the end of this week. This POC will address the
>>>>> points you've raised and ensure that the FLIP aligns with your
>>>>> recommendations.
>>>>>
>>>>> Thank you once again for your input.
>>>>> Bests,
>>>>> Samrat
>>>>>
>>>>> On Thu, Sep 7, 2023 at 10:21 PM Danny Cranmer <dannycran...@apache.org> wrote:
>>>>>
>>>>>> Hello Leonard,
>>>>>>
>>>>>>> Do we have to rely on the latest version of JDBC Connector here? I
>>>>>>> understand that as long as the version of flink minor is the same
>>>>>>> as the JDBC Connector, Could you collect the APIs that Redshift
>>>>>>> generally needs to use?
>>>>>>
>>>>>> I agree we do not necessarily need to rely on the latest patch
>>>>>> version, only the same minor. The main issue for me is that the
>>>>>> dependency introduces a blocker following a new Flink version. For
>>>>>> example, when Flink 1.18.0 is released we cannot release the AWS
>>>>>> connectors until the JDBC release is complete. But I think this is a
>>>>>> good tradeoff.
>>>>>>
>>>>>>> Splitting a separate redshift repository does not solve this
>>>>>>> coupling problem
>>>>>>
>>>>>> Arguably it solves the AWS<>JDBC coupling problem, but creates a
>>>>>> new, more complex one!
>>>>>>
>>>>>> Thanks,
>>>>>>
>>>>>> On Thu, Sep 7, 2023 at 5:26 PM Leonard Xu <xbjt...@gmail.com> wrote:
>>>>>>
>>>>>>> Thanks Samrat and Danny for driving this FLIP.
>>>>>>>
>>>>>>>>> an effective approach is to utilize the latest version of
>>>>>>>>> flink-connector-jdbc as a Maven dependency
>>>>>>>>
>>>>>>>> When we have stable source/sink APIs and the connector versions
>>>>>>>> are decoupled from Flink this makes sense. But right now this
>>>>>>>> would mean that the JDBC connector will block the AWS connector
>>>>>>>> for each new Flink version support release (1.18, 1.19, 1.20,
>>>>>>>> 2.0 etc).
>>>>>>>> That being said, I cannot think of a cleaner alternative without
>>>>>>>> pulling the core JDBC bits out into a dedicated project that is
>>>>>>>> decoupled from, and released independently of, Flink. Splitting
>>>>>>>> flink-connector-redshift into a dedicated repo would decouple
>>>>>>>> AWS/JDBC, but obviously introduce a new connector that is blocked
>>>>>>>> by both AWS and JDBC.
>>>>>>>
>>>>>>> Do we have to rely on the latest version of JDBC Connector here? I
>>>>>>> understand that as long as the version of flink minor is the same
>>>>>>> as the JDBC Connector, Could you collect the APIs that Redshift
>>>>>>> generally needs to use?
>>>>>>>
>>>>>>> Assuming that the AWS Connector (Redshift) depends on the JDBC
>>>>>>> Connector and wants a higher version of it, I understand that the
>>>>>>> correct approach is to promote the release of the JDBC Connector,
>>>>>>> and it looks like we have no more options.
>>>>>>>
>>>>>>> Splitting a separate redshift repository does not solve this
>>>>>>> coupling problem; from a user perspective, redshift should also be
>>>>>>> in the AWS Connector repo.
>>>>>>>
>>>>>>> Best,
>>>>>>> Leonard
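For context on the UNLOAD mode discussed in the thread: it relies on Redshift's UNLOAD command, which exports a query result to S3 so that Flink can read the files from there instead of pulling rows through JDBC. Below is a minimal sketch of how such a statement might be assembled; the function name, parameters, and option choices are illustrative assumptions, not taken from the FLIP or the POC.

```python
def build_unload_statement(query: str, s3_path: str, iam_role: str) -> str:
    """Build a Redshift UNLOAD statement exporting `query` results to S3.

    Illustrative sketch only (not the FLIP-307 implementation). Redshift
    requires single quotes inside the wrapped query to be doubled.
    """
    escaped = query.replace("'", "''")
    return (
        f"UNLOAD ('{escaped}') "
        f"TO '{s3_path}' "
        f"IAM_ROLE '{iam_role}' "
        f"FORMAT AS PARQUET"
    )


# Hypothetical table, bucket, and role names for illustration.
stmt = build_unload_statement(
    "SELECT * FROM sales WHERE region = 'EU'",
    "s3://my-bucket/export/sales_",
    "arn:aws:iam::123456789012:role/RedshiftUnloadRole",
)
print(stmt)
```

A connector in UNLOAD mode would submit a statement like this over the Redshift JDBC driver and then hand the resulting S3 prefix to a file source, which is why that flow can outperform row-by-row JDBC reads for large scans.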