[
https://issues.apache.org/jira/browse/FLINK-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279761#comment-17279761
]
Piotr Nowojski edited comment on FLINK-21307 at 2/5/21, 3:40 PM:
-----------------------------------------------------------------
I have a couple of concerns here.
1. I think currently there are a couple of bugs. For example at the end of the
{{StreamTask#triggerCheckpoint}}, {{FlinkSecurityManager}} is being disabled,
while this method is invoked from the task/mailbox thread. So it would disable
{{FlinkSecurityManager}} for any subsequent mailbox actions and record
processing which is happening in the task/mailbox thread (virtually all user
code invocation are happening there). At the very least, we should never turn
off/turn on {{FlinkSecurityManager}}, but we should be restoring pre-existing
setting (if it has already been enabled before calling
{{StreamTask#triggerCheckpoint}}, we should leave it enabled when leaving
{{StreamTask#triggerCheckpoint}}.
2. Is {{FlinkSecurityManager}} enabled in {{LegacySourceFunctionThread}}?
3. This also shows a bit fragile contract, where we are enabling/disabling
{{FlinkSecurityManager}} in random places. I would propose to change this to
either:
a) enable it only just before touching the user code (might be complicated
and cause some overhead in the critical per-record code paths)
b) enable it always whenever we are creating a new thread, that might be
executing user's code (task canceller thread, task thread,
LegacySourceFunctionThread, ...?), and manually/explicitly disable/bypass it
whenever we want Flink to exit via {{System.exit()}}. For example FLINK-21306.
In other words we would assume that all {{System.exit()}} calls are coming from
the user code, unless they are issued from {{FlinkSecurityManager}}.
Is there a way maybe to enable {{FlinkSecurityManager}} by default for every
thread? If so, that would help us to handle custom threads spawned by operators
(sources, sinks, {{AsyncWaitOperator}}, ...)/3rd party libraries (S3, ...) etc.
was (Author: pnowojski):
I have two concerns here.
1. I think currently there are a couple of bugs. For example at the end of the
{{StreamTask#triggerCheckpoint}}, {{FlinkSecurityManager}} is being disabled,
while this method is invoked from the task/mailbox thread. So it would disable
{{FlinkSecurityManager}} for any subsequent mailbox actions and record
processing which is happening in the task/mailbox thread (virtually all user
code invocation are happening there). At the very least, we should never turn
off/turn on {{FlinkSecurityManager}}, but we should be restoring pre-existing
setting (if it has already been enabled before calling
{{StreamTask#triggerCheckpoint}}, we should leave it enabled when leaving
{{StreamTask#triggerCheckpoint}}.
2. Is {{FlinkSecurityManager}} enabled in {{LegacySourceFunctionThread}}?
3. This also shows a bit fragile contract, where we are enabling/disabling
{{FlinkSecurityManager}} in random places. I would propose to change this to
either:
a) enable it only just before touching the user code (might be complicated
and cause some overhead in the critical per-record code paths)
b) enable it always whenever we are creating a new thread, that might be
executing user's code (task canceller thread, task thread,
LegacySourceFunctionThread, ...?), and manually/explicitly disable/bypass it
whenever we want Flink to exit via {{System.exit()}}. For example FLINK-21306.
In other words we would assume that all {{System.exit()}} calls are coming from
the user code, unless they are issued from {{FlinkSecurityManager}}.
Is there a way maybe to enable {{FlinkSecurityManager}} by default for every
thread? If so, that would help us to handle custom threads spawned by operators
(sources, sinks, {{AsyncWaitOperator}}, ...)/3rd party libraries (S3, ...) etc.
> Revisit activation model of FlinkSecurityManager
> ------------------------------------------------
>
> Key: FLINK-21307
> URL: https://issues.apache.org/jira/browse/FLINK-21307
> Project: Flink
> Issue Type: Bug
> Components: Runtime / Task
> Affects Versions: 1.13.0
> Reporter: Robert Metzger
> Priority: Critical
> Fix For: 1.13.0
>
>
> In FLINK-15156, we introduced a feature that allows users to log or
> completely disable calls to System.exit(). This feature is enabled for
> certain threads / code sections intended to execute user-code.
> The activation of the security manager (for monitoring user calls to
> System.exit() is currently not well-defined, and only implemented on a
> best-effort basis.
> This ticket is to revisit the activation.
--
This message was sent by Atlassian Jira
(v8.3.4#803005)