Hi Ferenc, I think this is indeed an important topic to have a discussion about. It probably also makes sense to create a FLIP and have a vote on this.
One potential option that we could have is to have a generic option to refer to external data when specifying a connector option, like ${file:path:key}. If you have an example like this: CREATE TABLE mytable ( ... ) WITH ( 'connector' = 'kafka', 'properties.bootstrap.servers' = '...:9092', 'topic' = '...', 'properties.ssl.keystore.password' = <password>, 'properties.ssl.keystore.location' = ..., 'properties.ssl.truststore.password' = <password>, 'properties.ssl.truststore.location' = ..., ); You would end up with something like CREATE TABLE mytable ( ... ) WITH ( 'connector' = 'kafka', 'properties.bootstrap.servers' = '${file:/connect/kafka.properties:BOOTSTRAP_SERVERS', 'topic' = '...', 'properties.ssl.keystore.password' = '${file:/credentials/credential.properties:KEYSTORE_PASSWORD', 'properties.ssl.keystore.location' = '${file:/credentials/keystore.jks', 'properties.ssl.truststore.password' = '${file:/credentials/credential.properties:TRUSTSTORE_PASSWORD', 'properties.ssl.truststore.location' = '${file:/credentials/truststore.jks' ); With the bootstrap.servers defined in a kafka.properties file and credentials in a credentials.properties file. You could also directly refer to things like a keystore file. Would be interesting to understand how your internal solution looks. It would probably also be a good idea to have a look at how others have solved this problem. And last but not least, we also need to think how potentially auto-rotated secrets would be made available for SQL users. Best regards, Martijn On Fri, Dec 2, 2022 at 1:37 AM Ferenc Csaky <ferenc.cs...@pm.me.invalid> wrote: > Hello devs, > > I'd like to revive this discussion. There is also a ticket about this > effort for some time [1] and this thing also affects us as well. Right now > we have a custom solution that is similar to "environment variables", but > it only can be used in parts of our downstream product. The main thing for > us to achieve would be to be able to use variables in DDLs (not necessarily > for hiding sensitive props). I think it would be really handy to have the > ability to reuse values in multiple tables. > > With that said, comes the temptation to hit two birds with one stone, > although a sensitive property requires much more care than a regular one, > so I think these two things should be handled separately. At least in the > beginning. The tricky part of the "environment variables" are their scope, > and if they are not coming from an external system, it will probably be > necessary to persist them. Or keep them in memory, but that may be > insufficient according to what is the scope of the "environment variables". > > Considering the sensitive props, I think a small step forward could be to > hide the values in case of a "SHOW CREATE TABLE" op. > > For a varible to be used in a DDL I'd imagine it could apply for a whole > catalog as starters. As long as the catalog is present, those variables > would be valid. > > I did not check implementation details yet, so it is possible I'm missing > something important or wrong in some places, but I wanted to get some > feedback about the idea. > > WDYT? > > [1] https://issues.apache.org/jira/browse/FLINK-28028 > > Best, > F > > > ------- Original Message ------- > On Monday, April 4th, 2022 at 09:53, Timo Walther <twal...@apache.org> > wrote: > > > > > > > > Hi Fred, > > > > thanks for starting this discussion. I totally agree that this an issue > > that the community should solve. It popped up before and is still > > unsolved today. Great that you offer your help here. So let's clarify > > the implementation details. > > > > 1) Global vs. Local solution > > > > Is this a DDL-only problem? If yes, it would be easier to solve it in > > the `FactoryUtil` that all Flink connectors and formats use. > > > > 2) Configruation vs. enviornment variables > > > > I agree with Qingsheng that environment variable are not always > > straightforward to identify if you have a "pre-flight phase" and a > > "cluster phase". > > In the DynamicTableFactory, one has access to Flink configuration and > > could resolve `${...}` variables. > > > > > > What do you think? > > > > Regards, > > Timo > > > > > > Am 01.04.22 um 12:26 schrieb Qingsheng Ren: > > > > > Hi Fred, > > > > > > Thanks for raising the discussion! I think the definition of > “environment variable” varies under different context. Under Flink on K8s > it means the environment variable for a container, and if you are a SQL > client user it could refer to environment variable of SQL client, or even > the system properties on JVM. So using “environment variable” is a bit > vague under different environments. > > > > > > A more generic solution in my mind is that we can take advantage of > configurations in Flink, to pass table options dynamically by adding > configs to TableConfig or even flink-conf.yaml. For example option > “table.dynamic.options.my_catalog.my_db_.my_table.accessId = foo” means > adding table option “accessId = foo” to table “my_catalog.my_db.my_table”. > By this way we could de-couple DDL statement with table options containing > secret credentials. What do you think? > > > > > > Best regards, > > > > > > Qingsheng > > > > > > > On Mar 30, 2022, at 16:25, Teunissen, F.G.J. (Fred) > fred.teunis...@ing.com.INVALID wrote: > > > > > > > > Hi devs, > > > > > > > > Some SQL Table properties contain sensitive data, like passwords > that we do not want to expose in the VVP ui to other users. Also, having > them clear text in a SQL statement is not secure. For example, > > > > > > > > CREATE TABLE Orders ( > > > > `user` BIGINT, > > > > product STRING, > > > > order_time TIMESTAMP(3) > > > > ) WITH ( > > > > 'connector' = 'kafka', > > > > > > > > 'properties.bootstrap.servers' = > 'kafka-host-1:9093,kafka-host-2:9093', > > > > 'properties.security.protocol' = 'SSL', > > > > 'properties.ssl.key.password' = 'should-be-a-secret', > > > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks', > > > > 'properties.ssl.keystore.password' = 'should-also-be-a-secret', > > > > 'properties.ssl.truststore.location' = > '/tmp/secrets/my-truststore.jks', > > > > 'properties.ssl.truststore.password' = 'should-again-be-a-secret', > > > > 'scan.startup.mode' = 'earliest-offset' > > > > ); > > > > > > > > I would like to bring up for a discussion a proposal to provide > these secrets values via environment variables since these can be populated > from a K8s configMap or secrets. > > > > > > > > For implementing the SQL Table properties, the ConfigOption<T> class > is used in connectors and formatters. This class could be extended that it > checks whether the config-value contains certain tokens, like > ‘${env-var-name}’. If it does, it could fetch the value from the > environment variable and use that to replace that token in the config-value. > > > > > > > > The above SQL statement would then look like, > > > > > > > > CREATE TABLE Orders ( > > > > `user` BIGINT, > > > > product STRING, > > > > order_time TIMESTAMP(3) > > > > ) WITH ( > > > > 'connector' = 'kafka', > > > > > > > > 'properties.bootstrap.servers' = > 'kafka-host-1:9093,kafka-host-2:9093', > > > > 'properties.security.protocol' = 'SSL', > > > > 'properties.ssl.key.password' = '${secret_kafka_ssl_key_password}', > > > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks', > > > > 'properties.ssl.keystore.password' = > '${secret_kafka_ssl_keystore_password}', > > > > 'properties.ssl.truststore.location' = > '/tmp/secrets/my-truststore.jks', > > > > 'properties.ssl.truststore.password' = > '${secret_kafka_ssl_truststore_password}', > > > > 'scan.startup.mode' = 'earliest-offset' > > > > ); > > > > > > > > For the purpose of secrets I don’t think you need any complex > processing of tokens but perhaps there are other usages as well. For > instance, > > > > > > > > 'properties.bootstrap.servers' = > 'kafka-${otap_env}-1:9093,kafka-${otap_env}-2:9093', > > > > > > > > Because it is possible that (but I think unlikely) someone wants a > property value like ‘${not-an-env-var}’ you need to be able to escape this > ’$’ token like ‘$${not-an-env-var}’. This also means that in theory it > would break compatibility. > > > > > > > > Looking forward for your feedback! > > > > > > > > Best, > > > > Fred Teunissen > > > > > > > > ----------------------------------------------------------------- > > > > ATTENTION: > > > > The information in this e-mail is confidential and only meant for > the intended recipient. If you are not the intended recipient, don't use or > disclose it in any way. Please let the sender know and delete the message > immediately. > > > > ----------------------------------------------------------------- > > > > >