[
https://issues.apache.org/jira/browse/KYLIN-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
]
liyang closed KYLIN-5700.
-------------------------
> Command line injection vulnerability when generating diagnostic packages via
> scripts
> ------------------------------------------------------------------------------------
>
> Key: KYLIN-5700
> URL: https://issues.apache.org/jira/browse/KYLIN-5700
> Project: Kylin
> Issue Type: Bug
> Components: Tools, Build and Test
> Affects Versions: 5.0-beta
> Reporter: Yaguang Jia
> Assignee: Yaguang Jia
> Priority: Critical
> Fix For: 5.0.0
>
>
> h2. Background
> In the current code, there are many places where a command line is
> assembled by string concatenation and then executed through
> {{ProcessBuilder}}. The parameters of the assembled command may come from
> API requests and are not validated, which leaves the command open to
> injection attacks.
> When assembling Spark commands, the {{checkCommandInjection}} method is used
> to guard against injection, but it only covers injection through backquotes
> and {{$()}}, such as {{`rm -rf /`}} and {{$(rm -rf /)}}. It does not cover
> other cases, such as {{cat nohup.out2 && echo success || echo failed}}.
> h2. Fix Design
> Validate parameters whenever a command is assembled. Four scenarios are
> covered:
> 1. Diagnostic package: the parameters of the {{diag.sh}} script (project,
> jobId, path, etc.) are concatenated into the command. Each parameter is
> checked in turn and must match {{^[a-zA-Z0-9_./-]+$}}.
> 2. InfluxDB export: the database address and the database name are passed
> as parameters of the {{influx}} command. The address must match
> {{[a-zA-Z0-9._-]+(:[0-9]+)?}} and the name must match
> {{^[0-9a-zA-Z_-]+$}}.
> 3. YARN stats: the YARN URL is passed as an argument to the {{curl}}
> command and must match
> {{^(http(s)?://)?[a-zA-Z0-9._-]+(:[0-9]+)?(/[a-zA-Z0-9._-]+)*/?$}}.
> 4. Beeline: the {{beeline-params}} from the configuration are concatenated
> into the command. Because their structure is more complex, each parameter
> value is forced into a string literal by wrapping it in single quotes, for
> example {{abc → 'abc'}} and {{ab'c → 'ab'\''c'}}.
>
>
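The four checks listed under Fix Design can be sketched in Java roughly as follows. This is only an illustration under stated assumptions, not the code from the actual patch: the class name {{CmdParamChecks}} and all method names are hypothetical, and only the regular expressions and the single-quote escaping rule are taken from the issue description. The sketch uses {{Matcher.matches()}}, which anchors every pattern to the whole input, including the InfluxDB address pattern that is written without {{^}} and {{$}} in the description.

```java
import java.util.regex.Pattern;

// Hypothetical helper; names are illustrative and not taken from the Kylin code base.
final class CmdParamChecks {
    // Scenario 1: diag.sh parameters (project, jobId, path, ...).
    private static final Pattern DIAG_ARG = Pattern.compile("^[a-zA-Z0-9_./-]+$");
    // Scenario 2: InfluxDB address (host with optional port) and database name.
    private static final Pattern INFLUX_ADDR = Pattern.compile("[a-zA-Z0-9._-]+(:[0-9]+)?");
    private static final Pattern INFLUX_DB = Pattern.compile("^[0-9a-zA-Z_-]+$");
    // Scenario 3: YARN URL handed to curl.
    private static final Pattern YARN_URL = Pattern.compile(
            "^(http(s)?://)?[a-zA-Z0-9._-]+(:[0-9]+)?(/[a-zA-Z0-9._-]+)*/?$");

    private CmdParamChecks() {
    }

    static boolean isValidDiagArg(String arg) {
        return arg != null && DIAG_ARG.matcher(arg).matches();
    }

    static boolean isValidInfluxAddress(String address) {
        // matches() requires the whole string to match, so no trailing
        // shell metacharacters can slip through the unanchored pattern.
        return address != null && INFLUX_ADDR.matcher(address).matches();
    }

    static boolean isValidInfluxDbName(String name) {
        return name != null && INFLUX_DB.matcher(name).matches();
    }

    static boolean isValidYarnUrl(String url) {
        return url != null && YARN_URL.matcher(url).matches();
    }

    // Scenario 4: wrap a beeline parameter value in single quotes.
    // An embedded ' is rewritten as '\'' (close quote, escaped quote, reopen quote).
    static String quoteForShell(String value) {
        return "'" + value.replace("'", "'\\''") + "'";
    }

    public static void main(String[] args) {
        System.out.println(isValidDiagArg("job_01/logs"));
        System.out.println(isValidDiagArg("nohup.out2 && echo success || echo failed"));
        System.out.println(isValidYarnUrl("http://rm-host:8088/ws/v1/cluster/metrics"));
        System.out.println(quoteForShell("ab'c"));
    }
}
```

With an allowlist of this kind, the {{&&}} and {{||}} chain from the Background section is rejected because spaces, {{&}} and {{|}} are outside the permitted character set, whereas a check that only looks for backquotes and {{$()}} would let it through.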
--
This message was sent by Atlassian Jira
(v8.20.10#820010)