Sorry, that was an error. I need the text content of PDF, TXT, DOC, and DOCX files to index in Solr.
Thanks for your help.

From: msaunier [mailto:[email protected]]
Sent: Friday, January 5, 2018 18:05
To: [email protected]
Subject: OCR Tika to read PDF, txt and doc docx

Hello,

How can I use/install an OCR to extract the content_html from files with ManifoldCF? I need the HTML content.

Thanks for your help,
