Hi Ferenc,

I think this is indeed an important topic to have a discussion about. It
probably also makes sense to create a FLIP and have a vote on this.

One option would be a generic way to refer to external data when
specifying a connector option, like ${file:path:key}. If you have an
example like this:

CREATE TABLE mytable (
...
)
WITH (
'connector' = 'kafka',
'properties.bootstrap.servers' = '...:9092',
'topic' = '...',
'properties.ssl.keystore.password' = <password>,
'properties.ssl.keystore.location' = ...,
'properties.ssl.truststore.password' = <password>,
'properties.ssl.truststore.location' = ...
);

You would end up with something like

CREATE TABLE mytable (
...
)
WITH (
'connector' = 'kafka',
'properties.bootstrap.servers' =
'${file:/connect/kafka.properties:BOOTSTRAP_SERVERS}',
'topic' = '...',
'properties.ssl.keystore.password' =
'${file:/credentials/credential.properties:KEYSTORE_PASSWORD}',
'properties.ssl.keystore.location' = '${file:/credentials/keystore.jks}',
'properties.ssl.truststore.password' =
'${file:/credentials/credential.properties:TRUSTSTORE_PASSWORD}',
'properties.ssl.truststore.location' = '${file:/credentials/truststore.jks}'
);

Here the bootstrap.servers value is defined in a kafka.properties file and
the credentials in a credential.properties file. You could also refer
directly to things like a keystore file.
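
To make the idea a bit more concrete, here is a minimal Python sketch of how
such placeholders could be resolved before the options reach the connector.
It is only an illustration under assumptions of mine: the function names are
hypothetical, the properties parser ignores escapes and line continuations,
and a reference without a key is assumed to stand for the file path itself.
None of this is existing Flink behaviour.

```python
import re
from pathlib import Path

# Hypothetical placeholder grammar: ${file:<path>} or ${file:<path>:<key>}.
# Paths containing ':' or '}' are not supported by this simple pattern.
_PLACEHOLDER = re.compile(r"\$\{file:([^:}]+)(?::([^}]+))?\}")


def load_properties(path):
    """Parse a simple key=value properties file (no escapes, no continuations)."""
    props = {}
    for raw in Path(path).read_text(encoding="utf-8").splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep:
            props[key.strip()] = value.strip()
    return props


def resolve_option(value):
    """Replace every ${file:...} placeholder in a single option value."""
    def substitute(match):
        path, key = match.group(1), match.group(2)
        if key is None:
            # Assumption: a key-less reference stands for the file itself,
            # so the option receives the path unchanged.
            return path
        return load_properties(path)[key]  # raises KeyError if the key is absent
    return _PLACEHOLDER.sub(substitute, value)


def resolve_options(options):
    """Resolve placeholders in a dict of table options."""
    return {name: resolve_option(value) for name, value in options.items()}
```

Values without a placeholder pass through unchanged, so existing DDLs would
keep working unless they already contain a literal ${file:...} string.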

It would be interesting to understand what your internal solution looks
like. It would probably also be a good idea to look at how others have
solved this problem. And last but not least, we also need to think about
how auto-rotated secrets would be made available to SQL users.

Best regards,

Martijn

On Fri, Dec 2, 2022 at 1:37 AM Ferenc Csaky <ferenc.cs...@pm.me.invalid>
wrote:

> Hello devs,
>
> I'd like to revive this discussion. There has been a ticket about this
> effort for some time [1], and the issue affects us as well. Right now
> we have a custom solution that is similar to "environment variables", but
> it can only be used in parts of our downstream product. The main thing for
> us would be the ability to use variables in DDLs (not necessarily
> for hiding sensitive props). I think it would be really handy to be able
> to reuse values in multiple tables.
>
> With that said, there is a temptation to kill two birds with one stone,
> but a sensitive property requires much more care than a regular one,
> so I think these two things should be handled separately, at least in the
> beginning. The tricky part of the "environment variables" is their scope,
> and if they do not come from an external system, it will probably be
> necessary to persist them. They could also be kept in memory, but that may
> be insufficient depending on the scope of the "environment variables".
>
> Regarding the sensitive props, I think a small step forward could be to
> hide the values in the output of a "SHOW CREATE TABLE" op.
>
> For a variable to be used in a DDL, I'd imagine it could apply to a whole
> catalog for starters. As long as the catalog is present, those variables
> would be valid.
>
> I did not check implementation details yet, so it is possible I'm missing
> something important or am wrong in some places, but I wanted to get some
> feedback about the idea.
>
> WDYT?
>
> [1] https://issues.apache.org/jira/browse/FLINK-28028
>
> Best,
> F
>
>
> ------- Original Message -------
> On Monday, April 4th, 2022 at 09:53, Timo Walther <twal...@apache.org>
> wrote:
>
>
> >
> >
> > Hi Fred,
> >
> > thanks for starting this discussion. I totally agree that this is an issue
> > that the community should solve. It popped up before and is still
> > unsolved today. Great that you offer your help here. So let's clarify
> > the implementation details.
> >
> > 1) Global vs. Local solution
> >
> > Is this a DDL-only problem? If yes, it would be easier to solve it in
> > the `FactoryUtil` that all Flink connectors and formats use.
> >
> > 2) Configuration vs. environment variables
> >
> > I agree with Qingsheng that environment variables are not always
> > straightforward to identify if you have a "pre-flight phase" and a
> > "cluster phase".
> > In the DynamicTableFactory, one has access to Flink configuration and
> > could resolve `${...}` variables.
> >
> >
> > What do you think?
> >
> > Regards,
> > Timo
> >
> >
> > On 01.04.22 at 12:26, Qingsheng Ren wrote:
> >
> > > Hi Fred,
> > >
> > > Thanks for raising the discussion! I think the definition of
> > > “environment variable” varies between contexts. With Flink on K8s
> > > it means an environment variable of a container, and if you are a SQL
> > > client user it could refer to an environment variable of the SQL client,
> > > or even to the system properties of the JVM. So the term “environment
> > > variable” is a bit vague across different environments.
> > >
> > > A more generic solution in my mind is to take advantage of
> > > configurations in Flink and pass table options dynamically by adding
> > > configs to TableConfig or even flink-conf.yaml. For example, the option
> > > “table.dynamic.options.my_catalog.my_db.my_table.accessId = foo” means
> > > adding table option “accessId = foo” to table “my_catalog.my_db.my_table”.
> > > This way we could decouple the DDL statement from table options
> > > containing secret credentials. What do you think?
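
A minimal Python sketch of how the key scheme suggested above could be
interpreted. The "table.dynamic.options." prefix is taken from the proposal
and is not an existing Flink option; the function name is hypothetical. The
sketch assumes that catalog, database and table names contain no dots, which
is what lets the remainder of the key (which may itself contain dots) be read
as the table option.

```python
# Prefix taken from the proposal above; not an existing Flink config key.
PREFIX = "table.dynamic.options."


def collect_table_options(config):
    """Group prefixed config entries by fully qualified table name."""
    per_table = {}
    for key, value in config.items():
        if not key.startswith(PREFIX):
            continue  # unrelated configuration entry
        # Assumption: the first three segments are catalog, database and
        # table; everything after them is the option key, dots included.
        parts = key[len(PREFIX):].split(".", 3)
        if len(parts) != 4:
            raise ValueError(f"malformed dynamic option key: {key}")
        catalog, database, table, option = parts
        per_table.setdefault(f"{catalog}.{database}.{table}", {})[option] = value
    return per_table
```

Identifiers that do contain dots would need some quoting rule, which is one
of the details a concrete proposal would have to settle.
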
> > >
> > > Best regards,
> > >
> > > Qingsheng
> > >
> > > > On Mar 30, 2022, at 16:25, Teunissen, F.G.J. (Fred)
> > > > fred.teunis...@ing.com.INVALID wrote:
> > > >
> > > > Hi devs,
> > > >
> > > > > Some SQL table properties contain sensitive data, such as
> > > > > passwords, that we do not want to expose to other users in the
> > > > > VVP UI. Also, having them in clear text in a SQL statement is not
> > > > > secure. For example,
> > > >
> > > > CREATE TABLE Orders (
> > > > `user` BIGINT,
> > > > product STRING,
> > > > order_time TIMESTAMP(3)
> > > > ) WITH (
> > > > 'connector' = 'kafka',
> > > >
> > > > 'properties.bootstrap.servers' =
> 'kafka-host-1:9093,kafka-host-2:9093',
> > > > 'properties.security.protocol' = 'SSL',
> > > > 'properties.ssl.key.password' = 'should-be-a-secret',
> > > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
> > > > 'properties.ssl.keystore.password' = 'should-also-be-a-secret',
> > > > > 'properties.ssl.truststore.location' = '/tmp/secrets/my-truststore.jks',
> > > > 'properties.ssl.truststore.password' = 'should-again-be-a-secret',
> > > > 'scan.startup.mode' = 'earliest-offset'
> > > > );
> > > >
> > > > > I would like to bring up for discussion a proposal to provide
> > > > > these secret values via environment variables, since these can be
> > > > > populated from a K8s ConfigMap or Secrets.
> > > >
> > > > > For implementing the SQL table properties, the ConfigOption<T>
> > > > > class is used in connectors and formats. This class could be
> > > > > extended so that it checks whether the config value contains certain
> > > > > tokens, like ‘${env-var-name}’. If it does, it could fetch the value
> > > > > from the environment variable and use it to replace that token in
> > > > > the config value.
> > > >
> > > > The above SQL statement would then look like,
> > > >
> > > > CREATE TABLE Orders (
> > > > `user` BIGINT,
> > > > product STRING,
> > > > order_time TIMESTAMP(3)
> > > > ) WITH (
> > > > 'connector' = 'kafka',
> > > >
> > > > 'properties.bootstrap.servers' =
> 'kafka-host-1:9093,kafka-host-2:9093',
> > > > 'properties.security.protocol' = 'SSL',
> > > > 'properties.ssl.key.password' = '${secret_kafka_ssl_key_password}',
> > > > 'properties.ssl.keystore.location' = '/tmp/secrets/my-keystore.jks',
> > > > 'properties.ssl.keystore.password' =
> '${secret_kafka_ssl_keystore_password}',
> > > > 'properties.ssl.truststore.location' =
> '/tmp/secrets/my-truststore.jks',
> > > > 'properties.ssl.truststore.password' =
> '${secret_kafka_ssl_truststore_password}',
> > > > 'scan.startup.mode' = 'earliest-offset'
> > > > );
> > > >
> > > > > For the purpose of secrets I don’t think you need any complex
> > > > > processing of tokens, but perhaps there are other usages as well.
> > > > > For instance,
> > > >
> > > > > 'properties.bootstrap.servers' = 'kafka-${otap_env}-1:9093,kafka-${otap_env}-2:9093',
> > > >
> > > > > Because it is possible (though I think unlikely) that someone
> > > > > wants a property value like ‘${not-an-env-var}’, you need to be able
> > > > > to escape the ‘$’ token, as in ‘$${not-an-env-var}’. This also means
> > > > > that, in theory, it would break compatibility.
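
A minimal Python sketch of the token substitution and the ‘$$’ escape
described above. The function name is hypothetical, and failing on an unset
variable is an assumption of this sketch; leaving the token untouched or
substituting an empty string would be equally possible choices.

```python
import os
import re

# Matches either an escaped "$$" or a "${name}" token, in that order.
_TOKEN = re.compile(r"\$\$|\$\{([^}]+)\}")


def substitute_env(value, env=None):
    """Expand ${name} tokens from env; "$$" yields a literal "$"."""
    env = os.environ if env is None else env

    def replace(match):
        if match.group(0) == "$$":
            return "$"  # escaped dollar sign, emitted literally
        name = match.group(1)
        if name not in env:
            # Assumption: an unset variable is treated as an error.
            raise KeyError(f"environment variable not set: {name}")
        return env[name]

    return _TOKEN.sub(replace, value)
```

With otap_env set to "tst", the bootstrap servers example above would expand
to 'kafka-tst-1:9093,kafka-tst-2:9093', and ‘$${not-an-env-var}’ would come
out as the literal ‘${not-an-env-var}’.
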
> > > >
> > > > > Looking forward to your feedback!
> > > >
> > > > Best,
> > > > Fred Teunissen
> > > >
> > > > -----------------------------------------------------------------
> > > > ATTENTION:
> > > > The information in this e-mail is confidential and only meant for
> the intended recipient. If you are not the intended recipient, don't use or
> disclose it in any way. Please let the sender know and delete the message
> immediately.
> > > > -----------------------------------------------------------------
> >
> >
>
