[ https://issues.apache.org/jira/browse/TIKA-4465?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=18014098#comment-18014098 ]
Tim Allison commented on TIKA-4465:
-----------------------------------

Thank you, Tilman. Great catch on "OnInstantiate".

It also looks like there's a StateEvent for 3D content during "Saving state data" and "Loading State data"? I can't tell if that's a pointer to an entry in the name tree's javascript entries or if that's actual javascript like "OnInstantiate". Let's put that in another ticket later if anyone wants it?

For the requirement handler dictionary, it looks like there's an optional {{Script}} value that contains the name of the script (12.11.5 Table 276). That name points to an entry in the name tree's javascript entries, so I don't think the javascript is stored there, but I may be misreading the spec.

Separately, it turns out that two of our existing unit test files contain javascript in the name trees: {{testPDFPackage.pdf}} and {{testPDF_XFA_govdocs1_258578.pdf}}. So we won't need to find unit test files.

> Extract javascript from name dictionary in PDFs
> -----------------------------------------------
>
>                 Key: TIKA-4465
>                 URL: https://issues.apache.org/jira/browse/TIKA-4465
>             Project: Tika
>          Issue Type: Task
>            Reporter: Tim Allison
>            Priority: Minor
>
> This blog
> [https://labs.senhasegura.blog/unmasking-the-threat-a-deep-dive-into-the-pdf-malicious-2/]
> mentions this malware file (be careful! dangerous!):
> [https://bazaar.abuse.ch/download/4dc9b0c20ea61d91d6a1b5bdce76fb5365de0762efb8f6c2925113c6a8950cae/]
>
> We're currently extracting javascript from actions, but not from the name
> tree (document-level javascript).
>
> We should add this extraction if "extractActions" is set to true... or
> better, come up with a better name for that variable in trunk.
>
> Related to this, I'd also like to extract javascript in TikaCLI by default as
> we do for extracting inline images and incremental updates.

--
This message was sent by Atlassian Jira
(v8.20.10#820010)