ZuhairORZaki opened a new pull request, #36469: URL: https://github.com/apache/airflow/pull/36469
While triaging your project, our bug fixing tool generated the following message- In file: `views.py`, there is a method that is vulnerable to XSS which can compromise any cookies, session tokens and other sensitive information used with the website and browser. [views.py](https://github.com/apache/airflow/blob/main/airflow/www/views.py#L2633) ```python @expose(\"/confirm\", methods=[\"GET\"]) @auth.has_access_dag(\"PUT\", DagAccessEntity.TASK_INSTANCE) @action_logging def confirm(self): \"\"\"Show confirmation page for marking tasks as success or failed.\"\"\" args = request.args ... dag_id = args.get(\"dag_id\") ... dag = get_airflow_app().dag_bag.get_dag(dag_id) if not dag: msg = f\"DAG {dag_id} not found\" return redirect_or_json(origin, msg, status=\"error\", status_code=404) ``` Here request handler method `confirm` reads the value of request parameter `dag_id`. If no associated `dag` object is found then it cobstructs a string `msg` with it and passes it to `redirect_or_json` method which sends back a response containing that `msg`. Since flask automatically url decodes request parameters `dag_id` can be assigned javascript code which will be sent back as part of the response. But the response content-type is application/json, so injected payload remains as a JSON data value in a JSON name/value pair and doesn't execute. So this isn't exactly an XSS vulnerability. Still if a client-side script receives this JSON value and uses some unsafe DOM update method e.g. `setInnerHTML`, `document.write` etc. then this could be a case of client-side XSS vulnerability. ### Exploitable Scenario Consider the following simple client side page. ```html <!DOCTYPE html> <html lang=\"en\"> <head> <meta charset=\"UTF-8\"> <meta name=\"viewport\" content=\"width=device-width, initial-scale=1.0\"> <title>Fetch and Display Data</title> </head> <body> <div id=\"detail\"></div> <script> document.addEventListener('DOMContentLoaded', function() { var pos=document.URL.indexOf(\"dag_id=\")+7; var dagId = decodeURIComponent(document.URL.substring(pos)); fetch(`https://example.airflow.com/confirm?dag_id=${dagId}`) .then(response => response.text()) .then(data => { document.getElementById(\"detail\").innerHTML = data; }) .catch(error => console.error('Error:', error)); }); </script> </body> </html> ``` Let's assume the page invoked with an URL such as ``` http://www.some.site/confirm_page.html?dag_id=123 ``` The script wihtin the page reads the `dag_id` parameter value from the url and sends a GET request to the above server with that value. The server then sends a Not Found response message if no associated dag object is found with the `dag_id` value included. Now an XSS attack can be carried out by sending an URL like below to a victim. ``` http://www.some.site/confirm_page.html?dag_id=%3Cimg%20src%3D%27x%27%20onerror%3D%27alert%281%29%27%3E ``` The payload here is a URL-encoded form of the following: ``` <img src='x' onerror='alert(1)'> ``` Since the client-side code here uses an unsafe DOM update method `(element).innerHTML` the injected payload will execute in the victim's browser. ### Suggested Fix One possible fix is sanitizing user provided data `dag_id` by defining a regex pattern that represents all inappropriate characters like the following: ```python pattern = re.compile(r'[^a-zA-Z0-9_\-]') ``` Then any character that matches the above pattern can be replaced with an empty string using following statement: ```python dag_id = re.sub(pattern, '', dag_id) ``` Now if an attacker supplies javascript code as value for `dag_id` e.g. `<script>alert(1)</script>`, it will just become `scriptalert1script`. ### CLA Requirements: This section is only relevant if your project requires contributors to sign a Contributor License Agreement (CLA) for external contributions. All contributed commits are already automatically signed off. The meaning of a signoff depends on the project, but it typically certifies that committer has the rights to submit this work under the same license and agrees to a Developer Certificate of Origin (see [https://developercertificate.org/](https://developercertificate.org/) for more information). - [Git Commit SignOff documentation](https://developercertificate.org/) ### Sponsorship and Support: This work is done by the security researchers from OpenRefactory and is supported by the [Open Source Security Foundation (OpenSSF)](https://openssf.org/): [Project Alpha-Omega](https://alpha-omega.dev/). Alpha-Omega is a project partnering with open source software project maintainers to systematically find new, as-yet-undiscovered vulnerabilities in open source code - and get them fixed - to improve global software supply chain security. The bug is found by running the iCR tool by [OpenRefactory, Inc.](https://openrefactory.com/) and then manually triaging the results. -- This is an automated message from the Apache Git Service. To respond to the message, please log on to GitHub and use the URL above to go to the specific comment. To unsubscribe, e-mail: [email protected] For queries about this service, please contact Infrastructure at: [email protected]
