Hello! Most people are probably running Tesseract 4 on Ubuntu, macOS, or Windows. Unfortunately, there are no clear instructions for installing Tesseract 4 on other flavors of Linux, most notably CentOS and Red Hat.
After going through dependency hell, I successfully installed Tesseract 4 on CentOS 7. I presume that the installation script should also work on Red Hat. I want to give credit to EisenVault, because this script is essentially a modified version of his script. This is my first contribution to open source software, so any tips will be highly appreciated!

When running this script line by line, you will probably have to prefix "sudo" to each command (the "cd" lines do not need it, since sudo cannot change your shell's directory), or you can paste everything into a bash script and run that script with sudo. I have tested both approaches on a fresh image of CentOS 7 on VirtualBox. Cheers!

# (Estimated Time of Completion: 45 minutes)
# Instructions taken (and slightly modified) from https://github.com/EisenVault/install-tesseract-redhat-centos/blob/master/install-tesseract.sh

cd /opt

# The following line will take 30 minutes to install.
yum -y update
yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
yum group install -y "Development Tools"

# Install Leptonica from Source
wget http://www.leptonica.com/source/leptonica-1.75.3.tar.gz
tar -zxvf leptonica-1.75.3.tar.gz
cd leptonica-1.75.3
./autobuild
./configure
# -j"$(nproc)" limits parallel jobs to the number of CPU cores (a bare "make -j" is unlimited and can exhaust memory on a small VM)
make -j"$(nproc)"
make install
cd ..
# Delete tar.gz file if you like

# Sanity checks
# check if libpng is installed: type "whereis libpng" and expect to see a directory; a blank line is not good
# check if leptonica is installed: type "ls /usr/local/include" and expect to see "leptonica"

# Install Tesseract from Source
wget https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.1.tar.gz
tar -zxvf 4.0.0-beta.1.tar.gz
cd tesseract-4.0.0-beta.1/
./autogen.sh
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j"$(nproc)"
make install
ldconfig
cd ..
# Delete tar.gz file if you like

# Download and install tesseract language files (Tesseract 4 traineddata files)
wget https://github.com/tesseract-ocr/tessdata/raw/master/osd.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/equ.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata
# download any other languages you like
mv *.traineddata /usr/local/share/tessdata

# Sanity check
# check if tesseract is installed: type "tesseract --version" and expect to see 1st line (tesseract), 2nd line (leptonica), 3rd line (libraries for images)
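
For anyone who would rather not run the sanity checks by hand, here is a minimal sketch of how they could be automated. It is not part of the script I tested: the function names (missing_traineddata, have_command, run_sanity_checks) are made up for illustration, and it assumes the traineddata files were moved to /usr/local/share/tessdata as in the script above.

```shell
#!/usr/bin/env bash
# Hypothetical helpers for checking the install; none of these names
# come from the original script or from Tesseract itself.

# Print each language code whose .traineddata file is missing.
# Usage: missing_traineddata <tessdata_dir> <lang>...
missing_traineddata() {
    local dir="$1"
    shift
    local lang
    for lang in "$@"; do
        if [ ! -f "$dir/$lang.traineddata" ]; then
            echo "$lang"
        fi
    done
}

# Succeed if the given command can be found on PATH.
have_command() {
    command -v "$1" >/dev/null 2>&1
}

# Run both checks; the tessdata directory defaults to the one used above.
# Returns non-zero if tesseract or any of the four language files is missing.
run_sanity_checks() {
    local tessdata_dir="${1:-/usr/local/share/tessdata}"
    local status=0
    local missing

    if have_command tesseract; then
        # Show the first three lines: tesseract, leptonica, image libraries.
        tesseract --version 2>&1 | head -n 3
    else
        echo "tesseract not found on PATH" >&2
        status=1
    fi

    missing="$(missing_traineddata "$tessdata_dir" osd equ eng chi_sim)"
    if [ -n "$missing" ]; then
        echo "missing traineddata files for:" $missing >&2
        status=1
    fi
    return "$status"
}
```

After the install finishes, source this file and call run_sanity_checks (optionally passing a different tessdata directory as the first argument). If you downloaded other languages, add their codes to the list inside run_sanity_checks.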