Hi Arvid,

I see what you mean; no solution in Flink will be able to account for the
different variations in which applications may want to pass in parameters
or the external processes or events that introspect wherever the Flink
process happens to run. I do think there is an opportunity to prevent
logging secrets by focusing on a couple of areas. I think we should improve
where we can because logs can end up in systems that a greater number of
people have access to. For example, in a given environment, perhaps only
automated systems have the ability to deploy and introspect the servers,
but engineers across teams may have access to all logs from that
environment.

The areas where I think we can prevent logging secrets are:
1) Obfuscating the values of JVM parameters
and
2) Applying the logic in ParameterTool's "fromArgs" method to parse out
arguments in the EnvironmentInformation class.

For example, one of the documented ways of passing in AWS credentials is
via JVM parameters:
https://docs.aws.amazon.com/sdk-for-java/v1/developer-guide/credentials.html
By leveraging ParameterTool's logic in the EnvironmentInformation class, we
can bridge the intent of the current code with how Flink's built-in
argument parser works.
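
To make the two ideas concrete, here is a rough sketch of what I have in
mind. It is not existing Flink code: the class name ArgumentMasker, the
list of sensitive key fragments, and the rule for telling keys from values
(a token starting with "-" that is not a number is a key) are all my own
assumptions, loosely modeled on how ParameterTool pairs keys with values.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not an existing Flink class.
final class ArgumentMasker {

    // Assumed key fragments that mark a parameter as sensitive.
    private static final String[] SENSITIVE_FRAGMENTS =
            {"password", "secret", "token", "accesskey"};

    static final String HIDDEN = "******";

    static boolean isSensitive(String key) {
        String lower = key.toLowerCase(Locale.ROOT);
        for (String fragment : SENSITIVE_FRAGMENTS) {
            if (lower.contains(fragment)) {
                return true;
            }
        }
        return false;
    }

    // Masks the value of a "-Dkey=value" JVM option when the key looks sensitive.
    static String maskJvmOption(String option) {
        int eq = option.indexOf('=');
        if (option.startsWith("-D") && eq > 2 && isSensitive(option.substring(2, eq))) {
            return option.substring(0, eq + 1) + HIDDEN;
        }
        return option;
    }

    // Assumption: a token is a key if it starts with "-" and is not a number,
    // so that negative numbers are still treated as values.
    private static boolean isKey(String token) {
        if (!token.startsWith("-")) {
            return false;
        }
        try {
            Double.parseDouble(token);
            return false;
        } catch (NumberFormatException e) {
            return true;
        }
    }

    // Keeps keys visible and masks the value token that follows a sensitive key.
    static List<String> maskProgramArgs(String[] args) {
        List<String> masked = new ArrayList<>();
        boolean hideNextValue = false;
        for (String arg : args) {
            if (isKey(arg)) {
                masked.add(arg);
                hideNextValue = isSensitive(arg);
            } else {
                masked.add(hideNextValue ? HIDDEN : arg);
                hideNextValue = false;
            }
        }
        return masked;
    }
}
```

With this sketch, "-Daws.secretKey=abc" would be logged with its value
replaced by "******", and "--my-password VALUE_TO_HIDE" would be logged as
the key followed by "******". Keeping the key visible and hiding only the
value is a design choice on my part; the current code does the opposite for
CLI arguments.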

On Thu, Jun 17, 2021 at 2:31 PM Arvid Heise <ar...@apache.org> wrote:

> Hi Jose,
>
> Masking secrets is a recurring topic where ultimately you won't find a
> good solution. Your secret might for example appear in a crash dump or on
> some process monitoring application. To mask reliably you'd either need
> specific application knowledge (every user supplies arguments differently)
> or disable logging of parameters completely.
>
> Frankly speaking, I have never seen passing passwords over the CLI be
> really secure. The industry practice is to either use a sidecar approach or
> fetch secrets file-based (e.g., docker mounts). Even using ENV is
> discouraged.
>
> On Wed, Jun 16, 2021 at 11:28 PM Jose Vargas <jose.var...@fiscalnote.com>
> wrote:
>
>> Hi,
>>
>> I am using Flink 1.13.1 and I noticed that the logs coming from the
>> EnvironmentInformation class,
>> https://github.com/apache/flink/blob/release-1.13.1/flink-runtime/src/main/java/org/apache/flink/runtime/util/EnvironmentInformation.java#L444-L467,
>> log the value of secrets that are passed in as JVM and CLI arguments. For
>> the JVM arguments, both the secret key and value are logged. For the CLI
>> arguments, the secret key is obfuscated, but the actual value of the secret
>> is not. This also affects Flink 1.12.
>>
>> For example, with CLI arguments like "--my-password VALUE_TO_HIDE", the
>> jobmanager will log the following (assuming cluster is in application mode)
>>
>> jobmanager     | ****** (sensitive information)
>> jobmanager     | VALUE_TO_HIDE
>>
>> The key is obfuscated but the actual value isn't. This means that secret
>> values can end up in central logging systems. Passing in the CLI argument
>> as "--my-password=VALUE_TO_HIDE" hides the entire string but makes the
>> value unusable and is different from how the docs say job arguments
>> should be passed in [1].
>>
>> I saw that there was a ticket to obfuscate secrets [2], but that seems to
>> only apply to the UI, not for the configuration logs. Turning off, or
>> otherwise disabling logs from the appropriate logger is one solution, but
>> it seems to me that the logger that a user would need to turn off is
>> dependent on how the Flink cluster is running (standalone, k8s, yarn,
>> mesos, etc). Furthermore, it can be useful to see these configuration logs.
>>
>>
>> [1]
>> https://ci.apache.org/projects/flink/flink-docs-release-1.13/docs/dev/datastream/application_parameters/#from-the-command-line-arguments
>> [2] https://issues.apache.org/jira/browse/FLINK-14047
>>
>> Thanks,
>> --
>>
>> Jose Vargas
>>
>> Software Engineer, Data Engineering
>>
>> E: jose.var...@fiscalnote.com
>>
>> fiscalnote.com <https://www.fiscalnote.com>  |  info.cq.com
>> <http://www.info.cq.com>  | rollcall.com <https://www.rollcall.com>
>>
>>

-- 

Jose Vargas

Software Engineer, Data Engineering

E: jose.var...@fiscalnote.com

fiscalnote.com <https://www.fiscalnote.com>  |  info.cq.com
<http://www.info.cq.com>  | rollcall.com <https://www.rollcall.com>
