#34322: ManifestStaticFilesStorage crashes on commented JavaScript import
statements
-------------------------------------+-------------------------------------
Reporter: Adam Johnson | Owner: nobody
Type: Bug | Status: closed
Component: contrib.staticfiles | Version: 4.2
Severity: Normal | Resolution: duplicate
Keywords: | Triage Stage: Accepted
Has patch: 0 | Needs documentation: 0
Needs tests: 0 | Patch needs improvement: 0
Easy pickings: 0 | UI/UX: 0
-------------------------------------+-------------------------------------
Comment (by Adam Johnson):
I just attached the exploratory patch. I'll explain the status below, but
first some fresh shower thoughts on impact.
This bug probably affects many projects. Lots of projects use a bunch of
JavaScript for their frontend, and the number of packages a typical JS
project *probably* means at least one uses TypeScript import comments.
It's also mostly outside of users' control - they need React, it pulls in
50 other things, they can't easily remove the comments.
Despite the impact, I think it's also *unlikely* to be detected by many
people in Django's beta phase. The standard advice is to test with your
project's test suite, which would does not include `collectstatic`. (In
fact, I only "accidentally" ran collectstatic, it wasn't on my checklist.)
Case in point: the CSS source map change to `HashedFilesMixin` in Django
4.1 (#33353) broke Django Rest Framework, but no one found that until the
final release. The fix was simple in that case: add the missing source map
file to DRF (https://github.com/encode/django-rest-framework/pull/8591).
But in this case, there's no simple fix. Users are unlikely to be able to
edit out every typescript comment from all their libraries.
Okay, on the patch I started on... it's quite exploratory really. It tries
to find all commented regions within the JS file, and skip processing
import statements within those. But I realized I would *really* need to
expand the approach to all kinds of string literals that might mention the
word "import", and then I got to the point of realizing that regexes just
won't cut it (not a huge surprise in retrospect).
To avoid all false positives, we need to properly lex/tokenize the JS so
we can partially-parse only real `import` statements. Luckily Django
already has a JS lexer, but it would still be some work.
The patch currently fails with JS regex literals that contain the word
"import", or quote marks/backticks (because they match a string start):
{{{
const importRe = /import/;
const doubleQuoteRe = /"/;
}}}
`django.utils.jslex` should be able to parse past those.
I would suggest that we *stop* processing files with regexes, at least
JavaScript files, and instead tokenize with `jslex`. Then we can process
the rules individually.
An optimization would be to skip tokenzing files that do not contain the
words `sourceMappingURL`, `import`, or `export` (which a first-pass regex
could detect). This would probably be required to avoid noticeable
slowdown, since the tokenization process is almost certainly slower than
the current regex approach.
This change feels quite heavy though, I'm not sure it's something we'd
want to do in a beta release?
I was using a test project with an `app.js` containing:
{{{
// These should not be processed
// @returns {import("./non-existent-1").something}
/* @returns {import("./non-existent-2").something} */
'import("./non-existent-3")'
"import('./non-existent-4')"
`import("./non-existent-5")`
r = /import/;
// This should be processed
import example from "./module.js";
}}}
And an empty file `module.js` next to it.
--
Ticket URL: <https://code.djangoproject.com/ticket/34322#comment:17>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.
--
You received this message because you are subscribed to the Google Groups
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/django-updates/01070186d523ca3a-8abdfc57-512c-4903-94f0-c622e1ff713a-000000%40eu-central-1.amazonses.com.