[ https://issues.apache.org/jira/browse/TIKA-2069?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=15534253#comment-15534253 ]
Jeff Swindle commented on TIKA-2069:
------------------------------------

[~talli...@apache.org] I tried a tika-app 1.14 snapshot and didn't get the expected output for the test-macro-doc.docm file. I also tried another internal file and didn't see macro output. Executing against xlsmacro.xlsm provided expected output of macro content.

I've attached the output from tika-app against xlsmacro.xlsm and test-macro-doc.docm.

Here are the commands I used:
# java -jar tika-app-1.14-20160928.190000-109.jar test-macro-doc.docm > tika-app-1.14-20160928.190000-109-test-macro-doc.docm.output
# java -jar tika-app-1.14-20160928.190000-109.jar xlsmacro.xlsm > tika-app-1.14-20160928.190000-109-xlsmacro.xlsm.output

Is there something specific I need to add to the command to extract the macro in the docm?

> Extract Macro text from Microsoft Office documents
> --------------------------------------------------
>
>                 Key: TIKA-2069
>                 URL: https://issues.apache.org/jira/browse/TIKA-2069
>             Project: Tika
>          Issue Type: Improvement
>          Components: detector, parser
>    Affects Versions: 1.13
>         Environment: RHEL 5.x, Apache Tomcat
>            Reporter: Jeff Swindle
>              Labels: features
>             Fix For: 2.0, 1.14
>
>         Attachments: excel-macro.PNG, test-macro-doc.docm,
> test-macro-doc.docm-tika-app-output.txt,
> tika-app-1.14-20160928.190000-109-test-macro-doc.docm.output,
> tika-app-1.14-20160928.190000-109-xlsmacro.xlsm.output, word-macro.PNG,
> xlsmacro.xlsm, xlsmacro.xlsm.tika-app-output.txt
>
>
> Tika supports macro-enabled Microsoft Office documents by extracting metadata
> and contents, however, macros within the document are not in the metadata or
> content output.
> Desire is to have the macro text extracted also.
> Info regarding macro extraction: http://www.decalage.info/vba_tools

--
This message was sent by Atlassian JIRA
(v6.3.4#6332)