Copilot commented on code in PR #39294:
URL: https://github.com/apache/superset/pull/39294#discussion_r3190456172


##########
superset/utils/file.py:
##########
@@ -14,10 +14,23 @@
 # KIND, either express or implied.  See the License for the
 # specific language governing permissions and limitations
 # under the License.
+import re
+
 from werkzeug.utils import secure_filename
 
+# All C0 (U+0000–U+001F), DEL (U+007F), and C1 (U+0080–U+009F) control characters.
+# Stripping every control char (including tab, LF, CR) keeps titles safe for
+# SMTP headers, Content-Disposition filenames, and headless-browser document.title.
+_CONTROL_CHARS_RE = re.compile(r"[\x00-\x1f\x7f-\x9f]")

Review Comment:
   The PR description says the backend sanitizer preserves tab/LF/CR, but the
regex strips every C0 control character, including `\x09`, `\x0A`, and `\x0D`,
and both the Python and TypeScript tests assert that tab/LF/CR are removed.
Please either update the PR description (and any related docs or comments
outside this diff) to match the implemented behavior, or adjust the regex and
tests if preserving tab/LF/CR is the intended requirement.
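
To make the two options concrete, here is a minimal sketch. The first pattern is the one quoted in the diff. The second pattern and both function names are hypothetical, written only to illustrate what "preserving tab/LF/CR" might look like. They are not taken from the PR.

```python
import re

# Pattern quoted from the diff: removes C0 controls, DEL, and C1 controls,
# which includes tab (\x09), LF (\x0a), and CR (\x0d).
STRIP_ALL_CONTROLS_RE = re.compile(r"[\x00-\x1f\x7f-\x9f]")

# Hypothetical alternative (an assumption, not in the PR): the same ranges
# with \x09, \x0a, and \x0d carved out so those three characters survive.
KEEP_TAB_LF_CR_RE = re.compile(r"[\x00-\x08\x0b\x0c\x0e-\x1f\x7f-\x9f]")


def strip_all_controls(value: str) -> str:
    """Remove every control character, matching the behavior in the diff."""
    return STRIP_ALL_CONTROLS_RE.sub("", value)


def strip_controls_keep_tab_lf_cr(value: str) -> str:
    """Remove control characters except tab, LF, and CR (illustrative only)."""
    return KEEP_TAB_LF_CR_RE.sub("", value)


sample = "Sales\tQ1\r\nReport\x00\x7f\x85"

all_stripped = strip_all_controls(sample)
whitespace_kept = strip_controls_keep_tab_lf_cr(sample)
```

With the diff's pattern the tab and line break in `sample` are dropped. With the alternative they remain, and only NUL, DEL, and the C1 character are removed. Keeping CR/LF would weaken the header-safety rationale stated in the code comment, so updating the PR description may be the simpler fix.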



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]

