#34322: ManifestStaticFilesStorage crashes on commented JavaScript import
statements
-------------------------------------+-------------------------------------
     Reporter:  Adam Johnson         |                    Owner:  nobody
         Type:  Bug                  |                   Status:  closed
    Component:  contrib.staticfiles  |                  Version:  4.2
     Severity:  Normal               |               Resolution:  duplicate
     Keywords:                       |             Triage Stage:  Accepted
    Has patch:  0                    |      Needs documentation:  0
  Needs tests:  0                    |  Patch needs improvement:  0
Easy pickings:  0                    |                    UI/UX:  0
-------------------------------------+-------------------------------------

Comment (by Adam Johnson):

 I just attached the exploratory patch. I'll explain the status below, but
 first some fresh shower thoughts on impact.

 This bug probably affects many projects. Lots of projects use a bunch of
 JavaScript for their frontend, and the number of packages a typical JS
 project *probably* means at least one uses TypeScript import comments.
 It's also mostly outside of users' control - they need React, it pulls in
 50 other things, they can't easily remove the comments.

 Despite the impact, I think it's also *unlikely* to be detected by many
 people in Django's beta phase. The standard advice is to test with your
 project's test suite, which would does not include `collectstatic`. (In
 fact, I only "accidentally" ran collectstatic, it wasn't on my checklist.)

 Case in point: the CSS source map change to `HashedFilesMixin` in Django
 4.1 (#33353) broke Django Rest Framework, but no one found that until the
 final release. The fix was simple in that case: add the missing source map
 file to DRF  (https://github.com/encode/django-rest-framework/pull/8591).
 But in this case, there's no simple fix. Users are unlikely to be able to
 edit out every typescript comment from all their libraries.

 Okay, on the patch I started on... it's quite exploratory really. It tries
 to find all commented regions within the JS file, and skip processing
 import statements within those. But I realized I would *really* need to
 expand the approach to all kinds of string literals that might mention the
 word "import", and then I got to the point of realizing that regexes just
 won't cut it (not a huge surprise in retrospect).

 To avoid all false positives, we need to properly lex/tokenize the JS so
 we can partially-parse only real `import` statements. Luckily Django
 already has a JS lexer, but it would still be some work.

 The patch currently fails with JS regex literals that contain the word
 "import", or quote marks/backticks (because they match a string start):

 {{{
 const importRe = /import/;
 const doubleQuoteRe = /"/;
 }}}

 `django.utils.jslex` should be able to parse past those.

 I would suggest that we *stop* processing files with regexes, at least
 JavaScript files, and instead tokenize with `jslex`. Then we can process
 the rules individually.

 An optimization would be to skip tokenzing files that do not contain the
 words `sourceMappingURL`, `import`, or `export` (which a first-pass regex
 could detect). This would probably be required to avoid noticeable
 slowdown, since the tokenization process is almost certainly slower than
 the current regex approach.

 This change feels quite heavy though, I'm not sure it's something we'd
 want to do in a beta release?

 I was using a test project with an `app.js` containing:

 {{{
 // These should not be processed
 // @returns {import("./non-existent-1").something}
 /* @returns {import("./non-existent-2").something} */
 'import("./non-existent-3")'
 "import('./non-existent-4')"
 `import("./non-existent-5")`
 r = /import/;
 // This should be processed
 import example from "./module.js";
 }}}

 And an empty file `module.js` next to it.

-- 
Ticket URL: <https://code.djangoproject.com/ticket/34322#comment:17>
Django <https://code.djangoproject.com/>
The Web framework for perfectionists with deadlines.

-- 
You received this message because you are subscribed to the Google Groups 
"Django updates" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to [email protected].
To view this discussion on the web visit 
https://groups.google.com/d/msgid/django-updates/01070186d523ca3a-8abdfc57-512c-4903-94f0-c622e1ff713a-000000%40eu-central-1.amazonses.com.

Reply via email to