Hi,

Today, for the first time, I had a use case for the office-scraper plugin [0]. The command line tools came in pretty handy here, and I made the following observation. If you are working with xls (older) or xlsx (newer, 2007-2010) formats, the files need to be ***originally*** written in Microsoft Excel. I can only assume that this is because the MIME type metadata is written and maintained based on the original editor. For example, I created two Excel documents in LibreOffice (ouch), as I am using Ubuntu... I saved them to my desktop and ran

law@CEE279Law3-Linux:~/Desktop$ any23 mimes file:///home/law/spec_table.xls
Display all 190 possibilities? (y or n)
Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xls
------------------------------------------------------------------------
Apache Any23 :: mimes
------------------------------------------------------------------------
application/x-tika-msoffice
------------------------------------------------------------------------
Apache Any23 SUCCESS
Total time: 0s
Finished at: Tue Jul 02 12:37:20 PDT 2013
Final Memory: 25M/479M
------------------------------------------------------------------------
Linux:~/Desktop$ any23 mimes file:///home/law/Desktop/spec_table.xlsx
------------------------------------------------------------------------
Apache Any23 :: mimes
------------------------------------------------------------------------
application/x-tika-ooxml
------------------------------------------------------------------------
Apache Any23 SUCCESS
Total time: 0s
Finished at: Tue Jul 02 12:37:29 PDT 2013
Final Memory: 25M/479M
------------------------------------------------------------------------

When I do

Linux:~/Desktop$ any23 verify ~/.any23/plugins
------------------------------------------------------------------------
Apache Any23 :: verify
------------------------------------------------------------------------
Plugin author : <unknown>
Plugin factory : class org.apache.any23.plugin.officescraper.ExcelExtractorFactory
Plugin mime-types: application/vnd.ms-excel;q=0.1 application/msexcel;q=0.1 application/x-msexcel;q=0.1 application/x-ms-excel;q=0.1
------------------------------------------------------------------------

The plugin will ***only*** work with document formats

application/vnd.ms-excel;q=0.1
application/msexcel;q=0.1
application/x-msexcel;q=0.1
application/x-ms-excel;q=0.1

So I am running between the library and my office punching in trivial spreadsheets to achieve what I want to do... the joys.

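To make the mismatch above concrete, here is a rough sketch in Python (purely for illustration, this is not Any23 code). It compares the MIME types reported by `any23 mimes` against the list advertised by `any23 verify`. The helper names `strip_params` and `plugin_accepts` are made up, and the exact-match comparison is my assumption; I have not checked how Any23 actually selects extractors.

```python
# MIME types advertised by ExcelExtractorFactory in the `any23 verify` output.
PLUGIN_MIME_TYPES = [
    "application/vnd.ms-excel;q=0.1",
    "application/msexcel;q=0.1",
    "application/x-msexcel;q=0.1",
    "application/x-ms-excel;q=0.1",
]


def strip_params(mime_type: str) -> str:
    """Drop parameters such as ';q=0.1' and normalise case and whitespace."""
    return mime_type.split(";", 1)[0].strip().lower()


def plugin_accepts(detected: str, advertised=PLUGIN_MIME_TYPES) -> bool:
    """Hypothetical check: does the detected type exactly match an advertised one?

    Exact matching is an assumption made for this sketch, not confirmed
    Any23 behaviour.
    """
    supported = {strip_params(m) for m in advertised}
    return strip_params(detected) in supported


if __name__ == "__main__":
    # The first two values are what `any23 mimes` reported for my LibreOffice files.
    for detected in (
        "application/x-tika-msoffice",
        "application/x-tika-ooxml",
        "application/vnd.ms-excel",
    ):
        print(detected, "->", plugin_accepts(detected))
```

Under that exact-match assumption, neither application/x-tika-msoffice nor application/x-tika-ooxml is in the advertised set, which would be consistent with the plugin not picking up my LibreOffice files.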
Thanks
Lewis

[0] http://s.apache.org/UaG

--
Lewis
