Thanks for the script to install tesseract on CentOS.

I would suggest using the traineddata files from the tessdata_fast (faster)
or tessdata_best (more accurate) repos instead of the main tessdata repo.
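
A minimal sketch of how the download URLs could be built for those repos.
The helper name traineddata_url is made up for illustration, and the URL
pattern simply mirrors the one used in the quoted script (raw/master),
which is an assumption about how the repos are laid out:

```shell
#!/bin/sh
# Sketch only: traineddata_url is a hypothetical helper, not part of Tesseract.

# Build the raw-download URL for one language file.
# $1 = repo name (tessdata_fast or tessdata_best), $2 = language code.
traineddata_url() {
    printf 'https://github.com/tesseract-ocr/%s/raw/master/%s.traineddata\n' "$1" "$2"
}

# Print the URLs first so they can be reviewed before downloading anything.
for lang in eng osd; do
    traineddata_url tessdata_fast "$lang"
done
```

To fetch a file straight into the tessdata directory used in the script,
something like this should work:
wget -P /usr/local/share/tessdata "$(traineddata_url tessdata_fast eng)"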

On Mon 23 Apr, 2018, 11:52 PM Eugene Huang, <eugeneh...@gmail.com> wrote:

> Hello! Most people are probably running Tesseract 4 on Ubuntu, MacOS, and
> Windows. Unfortunately, there are no clear instructions on installing
> Tesseract 4 for other flavors of Linux--probably most notably CentOS and
> Red Hat.
>
> After going through dependency hell, I successfully installed Tesseract 4
> onto CentOS 7. I presume that the installation script should also work for
> Red Hat. I want to give credit to EisenVault because this script is
> essentially a modified version of his script. This is my first contribution
> to open source software, so any tips will be highly appreciated!
>
> When running this script line by line, you will probably have to prefix
> each command with "sudo". Alternatively, paste everything into a bash
> script and run that script with sudo. I have tested both approaches on a
> fresh image of CentOS 7 in VirtualBox.
>
> Cheers!
>
> # (Estimated Time of Completion: 45 minutes)
> # Instructions taken (and slightly modified) from
> # https://github.com/EisenVault/install-tesseract-redhat-centos/blob/master/install-tesseract.sh
> cd /opt
> # The following line will take 30 minutes to install.
> yum -y update
> yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config \
>     gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
> yum group install -y "Development Tools"
>
>
> # Install Leptonica from Source
> wget http://www.leptonica.com/source/leptonica-1.75.3.tar.gz
> tar -zxvf leptonica-1.75.3.tar.gz
> cd leptonica-1.75.3
> ./autobuild
> ./configure
> make -j"$(nproc)"
> make install
> cd ..
> # Delete tar.gz file if you like
>
>
> # Sanity checks
> # check if libpng is installed: type "whereis libpng" and expect to see a
> # path after "libpng:"; no path means it is missing
> # check if leptonica is installed: type "ls /usr/local/include" and expect
> # to see "leptonica"
>
>
> # Install Tesseract from Source
> wget https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.1.tar.gz
> tar -zxvf 4.0.0-beta.1.tar.gz
> cd tesseract-4.0.0-beta.1/
> ./autogen.sh
> PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include \
>     ./configure --with-extra-includes=/usr/local/include \
>     --with-extra-libraries=/usr/local/lib
> LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j"$(nproc)"
> make install
> ldconfig
> cd ..
> # Delete tar.gz file if you like
>
>
> # Download and install tesseract language files (Tesseract 4 traineddata
> # files)
> wget https://github.com/tesseract-ocr/tessdata/raw/master/osd.traineddata
> wget https://github.com/tesseract-ocr/tessdata/raw/master/equ.traineddata
> wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
> wget https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata
> # download any other languages you like
> mv *.traineddata /usr/local/share/tessdata
>
>
> # Sanity check
> # check if tesseract is installed: type "tesseract --version" and expect
> # to see 1st line (tesseract), 2nd line (leptonica), 3rd line (libraries
> # for images)
>
> --
> You received this message because you are subscribed to the Google Groups
> "tesseract-ocr" group.
> To unsubscribe from this group and stop receiving emails from it, send an
> email to tesseract-ocr+unsubscr...@googlegroups.com.
> To post to this group, send email to tesseract-ocr@googlegroups.com.
> Visit this group at https://groups.google.com/group/tesseract-ocr.
> To view this discussion on the web visit
> https://groups.google.com/d/msgid/tesseract-ocr/d41ebcc5-b3b1-4e66-af8a-c7896814a7cc%40googlegroups.com
> <https://groups.google.com/d/msgid/tesseract-ocr/d41ebcc5-b3b1-4e66-af8a-c7896814a7cc%40googlegroups.com?utm_medium=email&utm_source=footer>
> .
> For more options, visit https://groups.google.com/d/optout.
>
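
The manual sanity checks in the quoted script could also be automated.
Below is a rough sketch under a few assumptions: check_cmd and check_path
are hypothetical helper names, and the paths are the default /usr/local
locations the script installs into. It only reports what it finds and
changes nothing:

```shell
#!/bin/sh
# Sketch only: check_cmd and check_path are hypothetical helpers.

# Report whether a command is available on PATH.
check_cmd() {
    if command -v "$1" >/dev/null 2>&1; then
        echo "ok: command $1"
    else
        echo "MISSING: command $1"
    fi
}

# Report whether a file or directory exists.
check_path() {
    if [ -e "$1" ]; then
        echo "ok: $1"
    else
        echo "MISSING: $1"
    fi
}

# Locations taken from the quoted install script (assumed defaults).
check_path /usr/local/include/leptonica
check_path /usr/local/share/tessdata/eng.traineddata
check_cmd tesseract

# Only query tesseract if it is actually installed.
if command -v tesseract >/dev/null 2>&1; then
    tesseract --version
    tesseract --list-langs
fi
```

Any line starting with MISSING points at the step of the install script
that is worth re-running.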

