rickchengx commented on PR #11589: URL: https://github.com/apache/dolphinscheduler/pull/11589#issuecomment-1236633109
> May I ask whether we could do this with some other ways in `sqoop task plugin` instead of add a specific condition here for `sqoop`? ShellCommandExecutor is shared across different task plugins, it might not be a good practice and uneasy to maintain if we need to change it every time we need to mask something for a task plugin. The same for line 55-56. Hi, @EricGao888. Thanks for your suggestions. I have moved the masking logic to a common util class. There are 2 possible locations that the log may contain the sensitive data (e.g., password). Taking the sqoop task as an example: 1. The log output in the `buildCommand()` in `SqoopTask`. 2. The log output in the `createCommandFileIfNotExists(String execCommand, String commandFile)` in `ShellCommandExecutor`. We need to mask the sensitive data of `execCommand` here since the `execCommand` will be output in the log. But this class is shared by multiple task plugins and how to mask the sensitive data needs to be discussed. As for me, there are 2 ways: 1. Use a common Utils class to mask the sensitive data of `execCommand` as below: `SensitiveUtils.maskSensitiveForExecCommand(execCommand)` So multiple task plugin (use ShellCommandExecutor) will use a comman method to mask the sensitive data of `execCommand`. Note that the command after masking will only used to output in the log, so this method will not affect the actual command executed. 2. Refactor the `createCommandFileIfNotExists()` and add a param `String execCommandMasking`. Also `AbstractCommandExecutor.run(execCommand)` needs to be refactored to `AbstractCommandExecutor.run(execCommand, execCommandMasking)`. By doing this, the process logic of how to mask the sensitive data can be done in each task plugin. But this will modify `AbstractCommandExecutor` and `ShellCommandExecutor`, and many task plugins need to be modified. Currently, this PR uses the first way to mask the sensitive data. And I also sent a related discussion email to the mailing list. So which way do you think is better? Or there are other better ways to do so. Any comments or suggestions are welcome! -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
