The "VirtualMachine" page has been changed by TimothyAllison:
https://wiki.apache.org/tika/VirtualMachine?action=diff&rev1=26&rev2=27

  ''NOTE:'' We found that ''pdftotext'' was not correctly reading the ''xpdfrc'' file in this location. We found no differences in extracted text when we removed the ''xpdfrc'' file and when we had it there. We did find a difference, especially in CJK PDFs, when we specified the ''xpdfrc'' file from the commandline with the ''-cfg'' option.

+ == ffmpeg ==
+
+  1. `sudo yum install epel-release`
+
+  2. `sudo yum localinstall --nogpgcheck https://download1.rpmfusion.org/free/el/rpmfusion-free-release-7.noarch.rpm https://download1.rpmfusion.org/nonfree/el/rpmfusion-nonfree-release-7.noarch.rpm`
+
+  3. `sudo yum install ffmpeg ffmpeg-devel`

  == Other data ==
  See ApacheTikaHtmlEncodingStudy for a description of gathering data for TIKA-2038.
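Returning to the ''xpdfrc'' note above: one way to check whether ''pdftotext'' is actually picking up a config file is to extract the same PDF twice, once with default settings and once with ''-cfg'', and compare the results. The following is only a minimal sketch. It assumes the Xpdf build of ''pdftotext'' (the one that accepts ''-cfg''). The function name `cfg_changes_output` and the `PDFTOTEXT` override variable are hypothetical and not part of the wiki page.

```shell
#!/bin/sh
# Sketch: compare pdftotext output with and without an explicit -cfg file.
# PDFTOTEXT and cfg_changes_output are illustrative names (assumptions).
PDFTOTEXT=${PDFTOTEXT:-pdftotext}

# Usage: cfg_changes_output <xpdfrc-path> <pdf-path>
# Prints "differs" if passing -cfg changes the extracted text, else "same".
cfg_changes_output() {
    cfg=$1
    pdf=$2
    tmpdir=$(mktemp -d) || return 1

    # Default run: pdftotext looks for its config in its usual locations.
    "$PDFTOTEXT" "$pdf" "$tmpdir/default.txt" || { rm -rf "$tmpdir"; return 1; }

    # Explicit run: point pdftotext at the config file with -cfg.
    "$PDFTOTEXT" -cfg "$cfg" "$pdf" "$tmpdir/explicit.txt" || { rm -rf "$tmpdir"; return 1; }

    # Byte-for-byte comparison of the two extractions.
    if cmp -s "$tmpdir/default.txt" "$tmpdir/explicit.txt"; then
        echo "same"
    else
        echo "differs"
    fi
    rm -rf "$tmpdir"
}
```

For example, `cfg_changes_output /path/to/xpdfrc sample-cjk.pdf` (both paths are placeholders) would print "differs" when the explicit config changes the output, as we observed for CJK PDFs.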
