Great blog post by Clinton Brownley today: <https://cbrownley.wordpress.com/2016/06/26/parsing-pdfs-in-python-with-tika/>
If you haven’t had a chance to check out tika-python [1], I recommend doing so! Would also appreciate any feedback or stars!

Cheers,
Chris

[1] http://github.com/chrismattmann/tika-python/

++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Chris Mattmann, Ph.D.
Chief Architect
Instrument Software and Science Data Systems Section (398)
NASA Jet Propulsion Laboratory Pasadena, CA 91109 USA
Office: 168-519, Mailstop: 168-527
Email: [email protected]
WWW: http://sunset.usc.edu/~mattmann/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
Director, Information Retrieval and Data Science Group (IRDS)
Adjunct Associate Professor, Computer Science Department
University of Southern California, Los Angeles, CA 90089 USA
WWW: http://irds.usc.edu/
++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++
