[ 
https://issues.apache.org/jira/browse/KYLIN-5700?page=com.atlassian.jira.plugin.system.issuetabpanels:all-tabpanel
 ]

Yaguang Jia resolved KYLIN-5700.
--------------------------------
    Resolution: Fixed

> Command line injection vulnerability when generating diagnostic packages via 
> scripts
> ------------------------------------------------------------------------------------
>
>                 Key: KYLIN-5700
>                 URL: https://issues.apache.org/jira/browse/KYLIN-5700
>             Project: Kylin
>          Issue Type: Bug
>          Components: Tools, Build and Test
>    Affects Versions: 5.0-beta
>            Reporter: Yaguang Jia
>            Assignee: Yaguang Jia
>            Priority: Critical
>             Fix For: 5.0.0
>
>
> h2. Background
> In the current code, there are many places where a command line is 
> assembled by string concatenation and then executed through 
> {{ProcessBuilder}}. The concatenated parameters may come from API requests 
> and are not validated, which leaves these code paths open to command 
> injection.
> When Spark commands are assembled, the {{checkCommandInjection}} method is 
> used to guard against injection, but it only blocks injection through 
> backquotes and $(), such as {{`rm -rf /`}} and {{$(rm -rf /)}}. It does not 
> cover other cases, such as {{cat nohup.out2 && echo success || echo failed}}.
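
The gap described in the Background can be illustrated with a minimal Python sketch. The function naive_check is a hypothetical stand-in that rejects only backquotes and "$(". It is not Kylin's actual checkCommandInjection implementation, only an approximation of the behavior the issue describes.

```python
def naive_check(arg: str) -> bool:
    """Hypothetical check: accept anything without a backquote or "$("."""
    return "`" not in arg and "$(" not in arg


# The two payloads named in the issue are rejected by this kind of check...
blocked = [naive_check("`rm -rf /`"), naive_check("$(rm -rf /)")]

# ...but shell operators such as && and || still pass it.
passed = naive_check("nohup.out2 && echo success || echo failed")
```

A deny-list of this kind has to enumerate every dangerous construct, which is why the fix below switches to allow-list patterns and quoting.
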
> h2. Fix Design
> Validate parameters when assembling commands. This covers the following 
> four scenarios:
> 1. When generating a diagnostic package, the parameters of the diag.sh 
> script, such as project, jobId, and path, are concatenated into the command. 
> Each parameter is checked in turn and must match {{^[a-zA-Z0-9_./-]+$}}.
> 2. When exporting InfluxDB data, the database address and the database name 
> are concatenated as parameters of the influx command. The former must match 
> {{[a-zA-Z0-9._-]+(:[0-9]+)?}} and the latter {{^[0-9a-zA-Z_-]+$}}.
> 3. When fetching YARN statistics, the YARN URL is concatenated as an 
> argument to the curl command. It must match 
> {{^(http(s)?://)?[a-zA-Z0-9._-]+(:[0-9]+)?(/[a-zA-Z0-9._-]+)*/?$}}.
> 4. When executing the beeline command, the beeline-params from the 
> configuration are concatenated into the command. Because the composition of 
> beeline-params is more complex, each parameter value is forcibly wrapped in 
> single quotes so that it is treated as a string, for example 
> {{abc → 'abc'}} and {{ab'c → 'ab'\''c'}}.
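
The allow-list checks in scenarios 1 to 3 can be sketched as follows, in Python for brevity. The patterns are taken from the issue text. The constant names and the is_allowed helper are hypothetical, and full-string matching is an assumption, since the address pattern in the issue carries no anchors. The host names in the usage lines are made up for illustration.

```python
import re

# Patterns as given in the Fix Design section; the names are hypothetical.
DIAG_PARAM = re.compile(r"^[a-zA-Z0-9_./-]+$")
INFLUX_ADDR = re.compile(r"[a-zA-Z0-9._-]+(:[0-9]+)?")
INFLUX_DB = re.compile(r"^[0-9a-zA-Z_-]+$")
YARN_URL = re.compile(
    r"^(http(s)?://)?[a-zA-Z0-9._-]+(:[0-9]+)?(/[a-zA-Z0-9._-]+)*/?$"
)


def is_allowed(pattern: re.Pattern, value: str) -> bool:
    """Return True only if the entire value matches the allow-list pattern."""
    return pattern.fullmatch(value) is not None


# Ordinary values pass.
ok_diag = is_allowed(DIAG_PARAM, "job-2023_01/diag.out")
ok_addr = is_allowed(INFLUX_ADDR, "localhost:8086")
ok_db = is_allowed(INFLUX_DB, "kylin_metrics")
ok_url = is_allowed(YARN_URL, "http://rm.example.com:8088/ws/v1/cluster/metrics")

# Values carrying shell operators or substitutions are rejected.
bad_diag = is_allowed(DIAG_PARAM, "nohup.out2 && echo success || echo failed")
bad_addr = is_allowed(INFLUX_ADDR, "localhost:8086; rm -rf /")
bad_url = is_allowed(YARN_URL, "http://rm.example.com:8088/ws?x=`id`")
```

Matching the whole string matters here: a search that accepts a match anywhere in the value would let a payload through as long as it contains one well-formed fragment.
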
>  
>  
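
The single-quote wrapping in scenario 4 can be sketched in the same way. The helper name quote_value is hypothetical. It reproduces the transformation shown in the issue (abc → 'abc', ab'c → 'ab'\''c') and does not claim to be the code used in the fix.

```python
import shlex


def quote_value(value: str) -> str:
    """Wrap value in single quotes, rewriting each embedded ' as '\\''."""
    return "'" + value.replace("'", "'\\''") + "'"


plain = quote_value("abc")      # 'abc'
tricky = quote_value("ab'c")    # 'ab'\''c'

# Parsing the quoted text with POSIX shell rules yields the original value as
# a single word, so substitutions inside it are not interpreted.
round_trip = shlex.split(quote_value("$(rm -rf /) && echo 'x'"))
```

Inside single quotes a POSIX shell treats every character literally, so the only character that needs special handling is the single quote itself, which is closed, escaped, and reopened.
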



--
This message was sent by Atlassian Jira
(v8.20.10#820010)
