Hey all,

Thanks for the nice FLIP! I'm just reading through and had one question on the ALTER CONNECTION implementation flow. Would it make sense for the WritableSecretStore to expose a method for updating a secret by ID, so that the update can be done atomically? Otherwise, would we need to call delete and then create again, potentially introducing resolution errors for anything that resolves the secret concurrently?
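
To make the concern concrete, here is a minimal sketch in Java. Everything in it is an assumption for illustration: the method names (storeSecret, getSecret, removeSecret, updateSecret) and the in-memory implementation are hypothetical and not taken from the FLIP. It only shows how an update-by-ID could keep the secret ID stable, so that a Connection's pointer never dangles the way it can between a delete and a create.

```java
import java.util.Map;
import java.util.NoSuchElementException;
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical interface shape; names are illustrative, not the FLIP's API.
interface WritableSecretStore {
    String storeSecret(Map<String, String> secretData);

    Map<String, String> getSecret(String secretId);

    void removeSecret(String secretId);

    // The method in question: swap the payload while keeping the ID stable.
    void updateSecret(String secretId, Map<String, String> newSecretData);
}

// Toy in-memory store, standing in for a real backend such as a cloud vault.
class InMemorySecretStore implements WritableSecretStore {
    private final ConcurrentHashMap<String, Map<String, String>> secrets =
            new ConcurrentHashMap<>();

    @Override
    public String storeSecret(Map<String, String> secretData) {
        String id = UUID.randomUUID().toString();
        secrets.put(id, Map.copyOf(secretData));
        return id;
    }

    @Override
    public Map<String, String> getSecret(String secretId) {
        Map<String, String> data = secrets.get(secretId);
        if (data == null) {
            throw new NoSuchElementException("Unknown secret: " + secretId);
        }
        return data;
    }

    @Override
    public void removeSecret(String secretId) {
        secrets.remove(secretId);
    }

    @Override
    public void updateSecret(String secretId, Map<String, String> newSecretData) {
        // replace(key, value) is a single atomic step that only succeeds if the
        // key exists, so concurrent readers see either the old or the new payload.
        if (secrets.replace(secretId, Map.copyOf(newSecretData)) == null) {
            throw new NoSuchElementException("Unknown secret: " + secretId);
        }
    }
}

class AtomicUpdateDemo {
    public static void main(String[] args) {
        WritableSecretStore store = new InMemorySecretStore();
        String id = store.storeSecret(Map.of("password", "xyz"));

        // ALTER CONNECTION path with an update method: the ID never changes.
        store.updateSecret(id, Map.of("password", "rotated"));
        System.out.println("Same ID resolves to: " + store.getSecret(id).get("password"));
    }
}
```

With delete followed by create, the new secret would get a fresh ID, so the catalog entry would also have to be rewritten, and anything resolving the old ID in between would fail.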

Best,
Austin

On Thu, Jul 17, 2025 at 13:07 Ryan van Huuksloot <ryan.vanhuuksl...@shopify.com.invalid> wrote:

> Hi Mayank,
>
> Thanks for updating the FLIP. Overall it looks good to me.
>
> One question I had is how someone could choose the SecretStore they want
> to use if they use something like the SQL Gateway as the entrypoint on
> top of a remote Session cluster. I don't see an explicit way to set the
> SecretStore in the FLIP.
> I assume we'll do it similar to the CatalogStore, but I wanted to call
> this out.
>
> table.catalog-store.kind: file
> table.catalog-store.file.path: file:///path/to/catalog/store/
>
> Ryan van Huuksloot
> Staff Engineer, Infrastructure | Streaming Platform
>
> On Wed, Jul 16, 2025 at 2:22 PM Mayank Juneja <mayankjunej...@gmail.com> wrote:
>
> > Hi everyone,
> >
> > Thanks for your valuable inputs. I have updated the FLIP with the ideas
> > proposed earlier in the thread. Looking forward to your feedback.
> > https://cwiki.apache.org/confluence/x/cYroF
> >
> > Best,
> > Mayank
> >
> > On Fri, Jun 27, 2025 at 2:59 AM Leonard Xu <xbjt...@gmail.com> wrote:
> >
> > > Quick response: thanks Mayank, Hao and Timo for the effort. The new
> > > proposal looks good, +1 from my side.
> > >
> > > Could you update the current FLIP doc so that we can have some
> > > specific discussions later?
> > >
> > > Best,
> > > Leonard
> > >
> > > > On Jun 26, 2025, at 15:06, Timo Walther <twal...@apache.org> wrote:
> > > >
> > > > Hi everyone,
> > > >
> > > > Sorry for the late reply, feature freeze kept me busy. Mayank, Hao
> > > > and I synced offline and came up with an improved proposal.
> > > > Before we update the FLIP, let me summarize the most important key
> > > > facts that hopefully address most concerns:
> > > >
> > > > 1) SecretStore
> > > > - Similar to CatalogStore, we introduce a SecretStore as the highest
> > > >   level in TableEnvironment.
> > > > - SecretStore is initialized with options and potentially environment
> > > >   variables. This includes
> > > >   EnvironmentSettings.withSecretStore(SecretStore).
> > > > - The SecretStore is pluggable and discovered using the regular
> > > >   factory approach.
> > > > - For example, it could implement Azure Key Vault or other cloud
> > > >   provider secret stores.
> > > > - Goal: Flink and Flink catalogs do not have to deal with sensitive
> > > >   data.
> > > >
> > > > 2) Connections
> > > > - Connections are catalog objects identified with 3-part identifiers.
> > > >   3-part identifiers are crucial for manageability of larger projects
> > > >   and align with existing catalog objects.
> > > > - They contain connection details, e.g. URL, query parameters, and
> > > >   other configuration.
> > > > - They do not contain secrets, but only pointers to secrets in the
> > > >   SecretStore.
> > > >
> > > > 3) Connection DDL
> > > >
> > > > CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> > > >   'type' = 'basic' | 'bearer' | 'jwt' | 'oauth' | ...,
> > > >   ...
> > > > )
> > > >
> > > > - Connection type is pluggable and discovered using the regular
> > > >   factory approach.
> > > > - The factory extracts secrets and puts them into SecretStore.
> > > > - The factory leaves only non-confidential options that can be
> > > >   stored in a catalog.
> > > >
> > > > When executing:
> > > > CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> > > >   'type' = 'basic',
> > > >   'url' = 'api.example.com',
> > > >   'username' = 'bob',
> > > >   'password' = 'xyz'
> > > > )
> > > >
> > > > The catalog will receive something similar to:
> > > > CREATE [TEMPORARY] CONNECTION mycat.mydb.OpenAPI WITH (
> > > >   'type' = 'basic',
> > > >   'url' = 'api.example.com',
> > > >   'secret.store' = 'azure-key-vault',
> > > >   'secret.id' = 'secretId'
> > > > )
> > > >
> > > > - However, the exact property design is up to the connection factory.
> > > >
> > > > 4) Connection Usage
> > > >
> > > > CREATE TABLE t (...) USING CONNECTION mycat.mydb.OpenAPI;
> > > >
> > > > - MODEL, FUNCTION, TABLE DDL will support the USING CONNECTION
> > > >   keyword, similar to BigQuery.
> > > > - The connection will be provided in a table/model provider/function
> > > >   definition factory.
> > > >
> > > > 5) CatalogStore / Catalog Initialization
> > > >
> > > > Catalog store or catalog can make use of SecretStore to retrieve
> > > > initial credentials for bootstrapping. All objects lower than catalog
> > > > store/catalog can then use connections. If you think we still need
> > > > system-level connections, we can support CREATE SYSTEM CONNECTION
> > > > GlobalName WITH (..), similar to SYSTEM functions, stored directly in
> > > > a ConnectionManager in TableEnvironment. But for now I would suggest
> > > > to start simple with per-catalog connections and evolve the design
> > > > later.
> > > >
> > > > Dealing with secrets is a very sensitive topic and I'm clearly not an
> > > > expert on it. This is why we should try to push the problem to
> > > > existing solutions and not start storing secrets in Flink in any way.
> > > > Thus, the interfaces will be defined very generically.
> > > >
> > > > Looking forward to your feedback.
> > > >
> > > > Cheers,
> > > > Timo
> > > >
> > > > On 09.06.25 04:01, Leonard Xu wrote:
> > > >> Thanks Timo for joining this thread.
> > > >> I agree that this feature is needed by the community; the current
> > > >> disagreement is only about the implementation method or solution.
> > > >> Your thoughts look generally good to me, looking forward to your
> > > >> proposal.
> > > >> Best,
> > > >> Leonard
> > > >>> On Jun 6, 2025, at 22:46, Timo Walther <twal...@apache.org> wrote:
> > > >>>
> > > >>> Hi everyone,
> > > >>>
> > > >>> Thanks for this healthy discussion. Looking at the high number of
> > > >>> participants, it looks like we definitely want this feature. We
> > > >>> just need to figure out the "how".
> > > >>>
> > > >>> This reminds me very much of the discussion we had for CREATE
> > > >>> FUNCTION. There, we discussed whether functions should be named
> > > >>> globally or catalog-specific. In the end, we decided for both
> > > >>> `CREATE SYSTEM FUNCTION` and `CREATE FUNCTION`, satisfying both the
> > > >>> data platform team of an organization (which might provide system
> > > >>> functions) and individual data teams or use cases (scoped by
> > > >>> catalog/database).
> > > >>>
> > > >>> Looking at other modern vendors like Snowflake, there is SECRET
> > > >>> (scoped to schema) [1] and API INTEGRATION [2] (scoped to account).
> > > >>> So other vendors also offer global and per-team / per-use-case
> > > >>> connection details.
> > > >>>
> > > >>> In general, I think fitting connections into the existing concepts
> > > >>> for catalog objects (with three-part identifier) makes managing
> > > >>> them easier. But I also see the need for global defaults.
> > > >>>
> > > >>> Btw keep in mind that a catalog implementation should only store
> > > >>> metadata. Similar to how a CatalogTable doesn't store the actual
> > > >>> data, a CatalogConnection should not store the credentials. It
> > > >>> should only offer a factory that allows for storing and retrieving
> > > >>> them.
> > > >>> In real-world scenarios a factory is most likely backed by a
> > > >>> product like Azure Key Vault.
> > > >>>
> > > >>> So code-wise, having a ConnectionManager that behaves similar to
> > > >>> FunctionManager sounds reasonable.
> > > >>>
> > > >>> +1 for having special syntax instead of using properties. This
> > > >>> allows accessing connections in tables, models, functions. And
> > > >>> catalogs, if we agree to have global ones as well.
> > > >>>
> > > >>> What do you think?
> > > >>>
> > > >>> Let me spend some more thoughts on this and come back with a
> > > >>> concrete proposal by early next week.
> > > >>>
> > > >>> Cheers,
> > > >>> Timo
> > > >>>
> > > >>> [1] https://docs.snowflake.com/en/sql-reference/sql/create-secret
> > > >>> [2] https://docs.snowflake.com/en/sql-reference/sql/create-api-integration
> > > >>>
> > > >>> On 04.06.25 10:47, Leonard Xu wrote:
> > > >>>> Hey Mayank,
> > > >>>> Please see my feedback below:
> > > >>>> 1. One of the motivations of this FLIP is to improve security.
> > > >>>> However, the current design stores all connection information in
> > > >>>> the catalog, and each Flink SQL job reads from the catalog during
> > > >>>> compilation. The connection information is passed between SQL
> > > >>>> Gateway and the catalog in plaintext, which actually introduces
> > > >>>> new security risks.
> > > >>>> 2. The name "Connection" should be changed to something like
> > > >>>> ConnectionSpec to clearly indicate that it is an object containing
> > > >>>> only static properties without a lifecycle. Putting aside the
> > > >>>> naming issue, I think the current model and hierarchy design is
> > > >>>> somewhat strange. Storing various kinds of connections (e.g.,
> > > >>>> Kafka, MySQL) in the same Catalog with hierarchical identifiers
> > > >>>> like catalog-name.db-name.connection-name raises the following
> > > >>>> questions:
> > > >>>> (1) What is the purpose of this hierarchical structure of the
> > > >>>> Connection object?
> > > >>>> (2) If we can use a Connection to create a MySQL table, why can't
> > > >>>> we use a Connection to create a MySQL Catalog?
> > > >>>> 3. Regarding the connector usage examples given in this FLIP:
> > > >>>> ```sql
> > > >>>>  1 -- Example 2: Using connection for jdbc tables
> > > >>>>  2 CREATE OR REPLACE CONNECTION mysql_customer_db
> > > >>>>  3 WITH (
> > > >>>>  4   'type' = 'jdbc',
> > > >>>>  5   'jdbc.url' = 'jdbc:mysql://customer-db.example.com:3306/customerdb',
> > > >>>>  6   'jdbc.connection.ssl.enabled' = 'true'
> > > >>>>  7 );
> > > >>>>  8
> > > >>>>  9 CREATE TABLE customers (
> > > >>>> 10   customer_id INT,
> > > >>>> 11   PRIMARY KEY (customer_id) NOT ENFORCED
> > > >>>> 12 ) WITH (
> > > >>>> 13   'connector' = 'jdbc',
> > > >>>> 14   'jdbc.connection' = 'mysql_customer_db',
> > > >>>> 15   'jdbc.connection.ssl.enabled' = 'true',
> > > >>>> 16   'jdbc.connection.max-retry-timeout' = '60s',
> > > >>>> 17   'jdbc.table-name' = 'customers',
> > > >>>> 18   'jdbc.lookup.cache' = 'PARTIAL'
> > > >>>> 19 );
> > > >>>> ```
> > > >>>> I see three issues from the SQL semantics and connector
> > > >>>> compatibility perspectives:
> > > >>>> (1) Look at line 14: `mysql_customer_db` is an object identifier
> > > >>>> of a CONNECTION defined in SQL. However, this identifier is
> > > >>>> referenced via a string value inside the table's WITH clause,
> > > >>>> which feels hacky to me.
> > > >>>> (2) Look at lines 14–16: the use of the specific prefix
> > > >>>> `jdbc.connection` will confuse users because `connection.xx` may
> > > >>>> already be used as a prefix for existing configuration items.
> > > >>>> (3) Look at lines 14–18: Why do all existing configuration options
> > > >>>> need to be prefixed with `jdbc`, even if they're not related to
> > > >>>> Connection properties? This completely changes user habits. Is it
> > > >>>> backward compatible?
> > > >>>> In my opinion, Connection should be a model independent of both
> > > >>>> Catalog and Table, and it should be referenceable by all
> > > >>>> catalog/table/udf/model objects.
> > > >>>> It should be managed by a component such as a ConnectionManager to
> > > >>>> enable reuse. For security purposes, authentication mechanisms
> > > >>>> could be supported within the ConnectionManager.
> > > >>>> Best,
> > > >>>> Leonard
> > > >>>>> On Jun 4, 2025, at 02:04, Martijn Visser <martijnvis...@apache.org> wrote:
> > > >>>>>
> > > >>>>> Hi all,
> > > >>>>>
> > > >>>>> First of all, I think having a Connection resource is something
> > > >>>>> that will be beneficial for Apache Flink. I could see that being
> > > >>>>> extended in the future to allow for easier secret handling [1].
> > > >>>>> In my mind, I'm comparing this proposal against SQL/MED from the
> > > >>>>> ISO standard [2]. I do think that SQL/MED isn't a very
> > > >>>>> user-friendly syntax though, looking at Postgres for example [3].
> > > >>>>>
> > > >>>>> I think it's a valid question whether Connection should be
> > > >>>>> considered with a catalog or database-level scope. @Ryan can you
> > > >>>>> share something more, since you've mentioned "Note: I much prefer
> > > >>>>> catalogs for this case. Which is what we use internally to manage
> > > >>>>> connection properties". It looks like there isn't a strongly
> > > >>>>> favoured approach looking at other vendors (e.g. Databricks
> > > >>>>> scopes it to a Unity catalog, Snowflake to a database level).
> > > >>>>>
> > > >>>>> Also looking forward to Leonard's input.
> > > >>>>>
> > > >>>>> Best regards,
> > > >>>>>
> > > >>>>> Martijn
> > > >>>>>
> > > >>>>> [1] https://issues.apache.org/jira/browse/FLINK-36818
> > > >>>>> [2] https://www.iso.org/standard/84804.html
> > > >>>>> [3] https://www.postgresql.org/docs/current/sql-createserver.html
> > > >>>>>
> > > >>>>> On Fri, May 30, 2025 at 5:07 AM Leonard Xu <xbjt...@gmail.com> wrote:
> > > >>>>>
> > > >>>>>> Hey Mayank,
> > > >>>>>>
> > > >>>>>> Thanks for the FLIP. I went through it quickly and found some
> > > >>>>>> issues which I think we need to discuss in depth later. As
> > > >>>>>> we're on a short Dragon Boat Festival holiday, could you kindly
> > > >>>>>> put this thread on hold? We will come back to continue the FLIP
> > > >>>>>> discussion.
> > > >>>>>>
> > > >>>>>> Best,
> > > >>>>>> Leonard
> > > >>>>>>
> > > >>>>>>> On Apr 29, 2025, at 23:07, Mayank Juneja <mayankjunej...@gmail.com> wrote:
> > > >>>>>>>
> > > >>>>>>> Hi all,
> > > >>>>>>>
> > > >>>>>>> I would like to open up for discussion a new FLIP-529 [1].
> > > >>>>>>>
> > > >>>>>>> Motivation:
> > > >>>>>>> Currently, Flink SQL handles external connectivity by defining
> > > >>>>>>> endpoints and credentials in table configuration. This
> > > >>>>>>> approach prevents reusability of these connections and makes
> > > >>>>>>> table definitions less secure by exposing sensitive
> > > >>>>>>> information.
> > > >>>>>>> We propose the introduction of a new "connection" resource in
> > > >>>>>>> Flink. This will be a pluggable resource configured with a
> > > >>>>>>> remote endpoint and associated access key. Once defined,
> > > >>>>>>> connections can be reused across table definitions, and
> > > >>>>>>> eventually for model definition (as discussed in FLIP-437) for
> > > >>>>>>> inference, enabling seamless and secure integration with
> > > >>>>>>> external systems.
> > > >>>>>>> The connection resource will provide a new, optional way to
> > > >>>>>>> manage external connectivity in Flink. Existing methods for
> > > >>>>>>> table definitions will remain unchanged.
> > > >>>>>>>
> > > >>>>>>> [1] https://cwiki.apache.org/confluence/x/cYroF
> > > >>>>>>>
> > > >>>>>>> Best Regards,
> > > >>>>>>> Mayank Juneja
> >
> > --
> > *Mayank Juneja*
> > Product Manager | Data Streaming and AI