[tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]
Hello Vinsec!

Sorry for the slow reply. Work is keeping me busy; no mystery there. I have been put on another project, so I haven't used Tesseract for a while. I hope somebody more experienced, like Shree or Александр Поздняков, can give you the right answer.

Good luck!
Eugene

On Sunday, February 17, 2019 at 6:59:27 AM UTC-5, Vinsec wrote:
> LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
> make[2]: Leaving directory `/root/tesseract-4.0.0/src/classify'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/root/tesseract-4.0.0'
> make: *** [all] Error 2
>
> I'm using CentOS 7 and installing Tesseract with your script, but it
> fails when I run the command above.
> I would appreciate it if you could give me some advice :)

--
You received this message because you are subscribed to the Google Groups "tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/92bb29d0-5eba-49b4-928b-1b7daf1024b5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.
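The log quoted above shows only the tail of the failure. GNU make's `-j` with no number places no limit on parallel jobs, so the compiler message that actually caused "Error 1" is usually printed much earlier, interleaved with output from other jobs, and an unbounded build can also run a small VM out of memory. A minimal sketch for locating the real error, assuming the build output is saved to a file; the helper names and the log name `build.log` are illustrative and not part of the original script:

```shell
#!/bin/sh
# Hypothetical helpers for locating the real error in a saved build log.

# Print the first line (with its line number) that looks like a compiler
# error ("error:") or a make failure ("*** ").
first_build_error() {
    grep -n -m 1 -E 'error:|\*\*\* ' "$1"
}

# Number of parallel jobs to use: the CPU count if nproc is available, else 1.
job_count() {
    nproc 2>/dev/null || echo 1
}

# Typical use inside the Tesseract source directory (not run here):
#   make -j"$(job_count)" 2>&1 | tee build.log
#   first_build_error build.log
```

Bounding the job count keeps the output easier to read as well; rerunning plain `make` after a failed parallel build is another common way to get the failing command printed last.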
Re: [tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]
Hello Periasamy!

Do you know which version of CentOS you're using? I used CentOS 7, and I haven't tried this installation script on other versions of CentOS. I never got the libtool error, so I'm sorry that I don't know the exact solution. If you are using CentOS 7, perhaps there have been updates to packages (like libtool) that make installation a little trickier.

Good luck! If you discover the solution, feel free to post it here.
Eugene

On Thu, Sep 6, 2018 at 11:54 PM Periasamy Kanagavel <periasamy.kanaga...@gmail.com> wrote:
> I am new to CentOS. I am trying the steps mentioned here. Up to the step
> "PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ...", there were no issues.
> While running the command
> LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
> I got the error "libtool: Version mismatch error. This is libtool 2.4.6, but the".
> Am I missing anything?
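libtool's "Version mismatch" message generally means that the generated build files (aclocal.m4, configure, and the libtool script) were produced by different libtool versions, and the message itself normally goes on to recommend recreating aclocal.m4 and running autoconf again. A hedged sketch of that remedy, assuming `autoreconf` from the autoconf package is installed; the helper names and `build.log` are hypothetical, and this was not confirmed as the fix in this thread:

```shell
#!/bin/sh
# Hypothetical helpers; the names are illustrative only.

# Succeeds if the given build log contains libtool's version-mismatch message.
has_libtool_mismatch() {
    grep -q 'libtool: Version mismatch error' "$1"
}

# Regenerate aclocal.m4, configure, and ltmain.sh using the autotools that
# are first on PATH. Run this inside the unpacked Tesseract source tree.
regenerate_build_files() {
    autoreconf --force --install
}

# Typical use (not run here):
#   if has_libtool_mismatch build.log; then
#       regenerate_build_files
#       ./configure --with-extra-includes=/usr/local/include \
#                   --with-extra-libraries=/usr/local/lib
#   fi
```

If two libtool installations are present (for example one from yum and one under /usr/local), regenerating may not be enough until only one of them is on PATH.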
[tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]
Hello Александр!

I took a look at your work; it is very extensive. If all the installations work, this should be front-paged! I have never used openSUSE. Could you point me to some resources on how to use your installation packages?

@shree Thanks for the info. I have definitely noticed that Tesseract 4 is more accurate. For example, Tesseract 4 can read small italic fonts, whereas Tesseract 3 makes lots of mistakes on them. Seems like Tesseract 4 is the future!
Re: [tesseract-ocr] Install Tesseract 4 on CentOS and Red Hat [SOLVED!]
@Shree Thanks for the tip. Just two quick questions.

1) According to https://github.com/tesseract-ocr/tesseract/wiki/Data-Files, the "osd" and "equ" traineddata files are compatible between Tesseract 3 and 4. In the GitHub tessdata_fast repo (https://github.com/tesseract-ocr/tessdata_fast), "osd" is there with the commit "Use legacy Orientation Script Detector (OSD) because that is the only thing that currently works." However, "equ" is not in the repo. Was this simply a small oversight where the maintainer forgot to include the "equ" data file?

2) With tessdata_fast, I was able to get Tesseract 4 running faster than Tesseract 4 with tessdata. However, Tesseract 4 is still slower than Tesseract 3 in my experience. Is that expected?

# Here are the updated instructions to download tessdata_fast, which I tested and found to be faster than tessdata.
# However, when calling Tesseract from the command line, the argument "--oem 2" will no longer work.
# Use "--oem 1", since tessdata_fast contains only the neural net LSTM model.
# -O keeps the "?raw=true" query string out of the saved file name.
wget -O osd.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/osd.traineddata?raw=true"
wget -O eng.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/eng.traineddata?raw=true"
wget -O chi_sim.traineddata "https://github.com/tesseract-ocr/tessdata_fast/blob/master/chi_sim.traineddata?raw=true"

On Monday, April 23, 2018 at 2:37:09 PM UTC-4, shree wrote:
> Thanks for the script to install tesseract on CentOS.
>
> I would suggest using traineddata files from tessdata_fast or
> tessdata_best repos for better accuracy and speed.
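One practical wrinkle with the "?raw=true" URLs: by default wget names the saved file after the last URL component, query string included, so a later `mv *.traineddata` would not match the downloads. A small sketch that builds the tessdata_fast URLs and saves each file under a clean name; the function names and the destination-directory argument are illustrative assumptions, not part of the original instructions:

```shell
#!/bin/sh
# Hypothetical helpers for fetching tessdata_fast models.

# Print the download URL for one language code (e.g. eng, osd, chi_sim),
# using the same URL form as the instructions in this thread.
tessdata_fast_url() {
    printf 'https://github.com/tesseract-ocr/tessdata_fast/blob/master/%s.traineddata?raw=true\n' "$1"
}

# Download each requested language into the destination directory,
# naming every file <lang>.traineddata.
fetch_traineddata() {
    dest=$1
    shift
    for lang in "$@"; do
        wget -O "$dest/$lang.traineddata" "$(tessdata_fast_url "$lang")"
    done
}

# Typical use (not run here; image.png is a placeholder input file):
#   fetch_traineddata /usr/local/share/tessdata osd eng chi_sim
#   tesseract image.png out --oem 1 -l eng
```

Writing straight into the tessdata directory also removes the separate `mv` step, at the cost of needing write permission there when the function runs.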
[tesseract-ocr] Install Tesseract 4 on CentOS and Red Hat [SOLVED!]
Hello!

Most people are probably running Tesseract 4 on Ubuntu, macOS, and Windows. Unfortunately, there are no clear instructions for installing Tesseract 4 on other flavors of Linux, most notably CentOS and Red Hat.

After going through dependency hell, I successfully installed Tesseract 4 on CentOS 7. I presume that the installation script should also work for Red Hat. I want to give credit to EisenVault because this script is essentially a modified version of the EisenVault script. This is my first contribution to open source software, so any tips will be highly appreciated!

When running this script line by line, you will probably have to prefix each line with "sudo". Alternatively, copy and paste everything into a bash script and run that script with sudo. I have tested both approaches on a fresh image of CentOS 7 in VirtualBox.

Cheers!

# (Estimated Time of Completion: 45 minutes)
# Instructions taken (and slightly modified) from
# https://github.com/EisenVault/install-tesseract-redhat-centos/blob/master/install-tesseract.sh
cd /opt
# The following line will take 30 minutes to install.
yum -y update
yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
yum group install -y "Development Tools"

# Install Leptonica from Source
wget http://www.leptonica.com/source/leptonica-1.75.3.tar.gz
tar -zxvf leptonica-1.75.3.tar.gz
cd leptonica-1.75.3
./autobuild
./configure
make -j
make install
cd ..
# Delete tar.gz file if you like

# Sanity checks
# check if libpng is installed: type "whereis libpng" and expect to see a path; an empty result is not good
# check if leptonica is installed: type "ls /usr/local/include" and expect to see "leptonica"

# Install Tesseract from Source
wget https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.1.tar.gz
tar -zxvf 4.0.0-beta.1.tar.gz
cd tesseract-4.0.0-beta.1/
./autogen.sh
# The two variables are set only for this configure invocation.
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include \
    ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
make install
ldconfig
cd ..
# Delete tar.gz file if you like

# Download and install tesseract language files (Tesseract 4 traineddata files)
wget https://github.com/tesseract-ocr/tessdata/raw/master/osd.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/equ.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata
# download any other languages you like
mv *.traineddata /usr/local/share/tessdata

# Sanity check
# check if tesseract is installed: type "tesseract --version" and expect to see
# 1st line (tesseract), 2nd line (leptonica), 3rd line (libraries for images)
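The manual sanity checks in the script can also be automated. A minimal sketch, assuming the default /usr/local prefix used by the script; the function names are hypothetical, and the checks only look for the headers directory and the `tesseract` binary rather than verifying that OCR actually works:

```shell
#!/bin/sh
# Hypothetical post-install checks.

# Succeeds if Leptonica headers are present under the given include directory.
has_leptonica_headers() {
    [ -d "$1/leptonica" ]
}

# Succeeds if the named command is on PATH.
has_command() {
    command -v "$1" >/dev/null 2>&1
}

# Print a short report; returns non-zero if anything is missing.
# Optional argument: the include directory (defaults to /usr/local/include).
check_install() {
    include_dir=${1:-/usr/local/include}
    status=0
    if has_leptonica_headers "$include_dir"; then
        echo "leptonica headers: ok"
    else
        echo "leptonica headers: missing in $include_dir"
        status=1
    fi
    if has_command tesseract; then
        echo "tesseract: ok"
    else
        echo "tesseract: not on PATH"
        status=1
    fi
    return "$status"
}
```

After a successful run, `check_install && tesseract --version && tesseract --list-langs` would print the report, the version lines described in the script, and the installed languages.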