[jira] [Comment Edited] (FLINK-21307) Revisit activation model of FlinkSecurityManager

Piotr Nowojski (Jira) Fri, 05 Feb 2021 07:41:09 -0800


    [ 
https://issues.apache.org/jira/browse/FLINK-21307?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=17279761#comment-17279761
 ]


Piotr Nowojski edited comment on FLINK-21307 at 2/5/21, 3:40 PM:
-----------------------------------------------------------------

I have a couple of concerns here.

1. I think currently there are a couple of bugs. For example at the end of the 
{{StreamTask#triggerCheckpoint}}, {{FlinkSecurityManager}} is being disabled, 
while this method is invoked from the task/mailbox thread. So it would disable 
{{FlinkSecurityManager}} for any subsequent mailbox actions and record 
processing which is happening in the task/mailbox thread (virtually all user 
code invocation are happening there). At the very least, we should never turn 
off/turn on {{FlinkSecurityManager}}, but we should be restoring pre-existing 
setting (if it has already been enabled before calling 
{{StreamTask#triggerCheckpoint}}, we should leave it enabled when leaving 
{{StreamTask#triggerCheckpoint}}.

2. Is {{FlinkSecurityManager}} enabled in {{LegacySourceFunctionThread}}?

3. This also shows a bit fragile contract, where we are enabling/disabling 
{{FlinkSecurityManager}} in random places. I would propose to change this to 
either:
  a) enable it only just before touching the user code (might be complicated 
and cause some overhead in the critical per-record code paths)
  b) enable it always whenever we are creating a new thread, that might be 
executing user's code (task canceller thread, task thread, 
LegacySourceFunctionThread, ...?), and manually/explicitly disable/bypass it 
whenever we want Flink to exit via {{System.exit()}}. For example FLINK-21306. 
In other words we would assume that all {{System.exit()}} calls are coming from 
the user code, unless they are issued from {{FlinkSecurityManager}}. 

Is there a way maybe to enable {{FlinkSecurityManager}} by default for every 
thread? If so, that would help us to handle custom threads spawned by operators 
(sources, sinks, {{AsyncWaitOperator}}, ...)/3rd party libraries (S3, ...) etc.


was (Author: pnowojski):
I have two concerns here.

1. I think currently there are a couple of bugs. For example at the end of the 
{{StreamTask#triggerCheckpoint}}, {{FlinkSecurityManager}} is being disabled, 
while this method is invoked from the task/mailbox thread. So it would disable 
{{FlinkSecurityManager}} for any subsequent mailbox actions and record 
processing which is happening in the task/mailbox thread (virtually all user 
code invocation are happening there). At the very least, we should never turn 
off/turn on {{FlinkSecurityManager}}, but we should be restoring pre-existing 
setting (if it has already been enabled before calling 
{{StreamTask#triggerCheckpoint}}, we should leave it enabled when leaving 
{{StreamTask#triggerCheckpoint}}.

2. Is {{FlinkSecurityManager}} enabled in {{LegacySourceFunctionThread}}?

3. This also shows a bit fragile contract, where we are enabling/disabling 
{{FlinkSecurityManager}} in random places. I would propose to change this to 
either:
  a) enable it only just before touching the user code (might be complicated 
and cause some overhead in the critical per-record code paths)
  b) enable it always whenever we are creating a new thread, that might be 
executing user's code (task canceller thread, task thread, 
LegacySourceFunctionThread, ...?), and manually/explicitly disable/bypass it 
whenever we want Flink to exit via {{System.exit()}}. For example FLINK-21306. 
In other words we would assume that all {{System.exit()}} calls are coming from 
the user code, unless they are issued from {{FlinkSecurityManager}}. 

Is there a way maybe to enable {{FlinkSecurityManager}} by default for every 
thread? If so, that would help us to handle custom threads spawned by operators 
(sources, sinks, {{AsyncWaitOperator}}, ...)/3rd party libraries (S3, ...) etc.

> Revisit activation model of FlinkSecurityManager
> ------------------------------------------------
>
>                 Key: FLINK-21307
>                 URL: https://issues.apache.org/jira/browse/FLINK-21307
>             Project: Flink
>          Issue Type: Bug
>          Components: Runtime / Task
>    Affects Versions: 1.13.0
>            Reporter: Robert Metzger
>            Priority: Critical
>             Fix For: 1.13.0
>
>
> In FLINK-15156, we introduced a feature that allows users to log or 
> completely disable calls to System.exit(). This feature is enabled for 
> certain threads / code sections intended to execute user-code.
> The activation of the security manager (for monitoring user calls to 
> System.exit() is currently not well-defined, and only implemented on a 
> best-effort basis.
> This ticket is to revisit the activation.



--
This message was sent by Atlassian Jira
(v8.3.4#803005)

[jira] [Comment Edited] (FLINK-21307) Revisit activation model of FlinkSecurityManager

Reply via email to