Tika stand alone CLI --text output mostly not working, other output formats are fine
------------------------------------------------------------------------------------
Key: TIKA-179
URL: https://issues.apache.org/jira/browse/TIKA-179
Project: Tika
Issue Type: Bug
Components: cli
Affects Versions: 0.2, 0.3
Environment: Java 1.5 (also tried Java 1.6). OS used: Mac OS X, Linux (CentOS)
Reporter: Paul Borgermans

When running the Tika standalone jar (built with mvn install) in CLI mode, the plain text output option (-t or --text) produces no output for most of my test documents (pdf, doc, ppt, odt, ...).

With the other output options (xml, html, metadata), the output is correct. Activating debug mode (-v) does not produce any additional information either.

In the GUI, dragging and dropping the same documents does produce the expected results, including in the plain text tab/window.

I have rebuilt Tika from svn many times in the past 2 months, clearing the .m2 directory every time (latest revision tried: 724002). The CLI --text result is always the same: the output is usually missing.

For now, I use the -x output option chained to html2txt as a workaround, but I would prefer to use Tika alone to convert to plain text (which is then used for further indexing in Solr).

Thanks
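
The second half of the workaround above (turning the XHTML written by -x into plain text) can be sketched with the JDK's built-in SAX parser. This is only an illustration of the idea, not Tika code and not the html2txt tool mentioned in the report: the class name XhtmlToText and the method extractText are hypothetical, and the sketch assumes the input is well-formed XHTML without an external DTD reference.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParser;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical helper (not part of Tika): keeps the character data of an
// XHTML document and drops all markup.
class XhtmlToText {

    // Returns the text content of the given well-formed XHTML string.
    static String extractText(String xhtml) {
        final StringBuilder text = new StringBuilder();
        DefaultHandler handler = new DefaultHandler() {
            @Override
            public void characters(char[] ch, int start, int length) {
                // Character data between tags is the plain text we want.
                text.append(ch, start, length);
            }

            @Override
            public void endElement(String uri, String localName, String qName) {
                // Assumption: a newline after these elements is enough
                // to keep paragraphs apart in the output.
                if (qName.equals("p") || qName.equals("div") || qName.equals("title")) {
                    text.append('\n');
                }
            }
        };
        try {
            // The default factory is not namespace-aware, so qName holds
            // the plain element name ("p", "div", ...).
            SAXParser parser = SAXParserFactory.newInstance().newSAXParser();
            parser.parse(new InputSource(new StringReader(xhtml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("Could not parse XHTML input", e);
        }
        return text.toString();
    }
}
```

The XHTML that the -x option writes to standard output could be read into a string and passed to extractText; the result is roughly what --text would be expected to print.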