[tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2019-03-05 Thread Eugene Huang
Hello Vinsec!

Sorry for the slow reply; work has been keeping me busy.
I have been moved to another project, so I haven't used Tesseract for a 
while. I hope somebody more experienced, like Shree or Александр Поздняков, 
can give you the right answer.
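
One thing that may help whoever picks this up: the output quoted below shows only make's summary lines ("Error 1", "Error 2"), not the first real compiler or linker message, which tends to get buried when "make -j" interleaves output. A minimal sketch, assuming the build output has been saved to a log file; the helper name first_build_error and the patterns it searches for are my own guesses, not anything Tesseract provides:

```shell
# Hypothetical helper: print the first line of a saved build log that looks
# like a real compiler or linker error (prefixed with its line number in the log).
# The patterns are guesses at common GCC/ld messages, not an exhaustive list.
first_build_error() {
    grep -n -m 1 -E 'error:|undefined reference|No such file or directory' "$1"
}

# Possible usage (not run here): rebuild without -j so messages stay in order,
# save everything to build.log, then look at the first match.
#   make 2>&1 | tee build.log
#   first_build_error build.log
```

Rebuilding serially is slower, but the first match in the log is then usually the actual cause rather than a follow-on failure.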

Good luck!
Eugene

On Sunday, February 17, 2019 at 6:59:27 AM UTC-5, Vinsec wrote:
>
> LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
> make[2]: Leaving directory `/root/tesseract-4.0.0/src/classify'
> make[1]: *** [all-recursive] Error 1
> make[1]: Leaving directory `/root/tesseract-4.0.0'
> make: *** [all] Error 2
>
> I'm using CentOS 7 and installing Tesseract with your script, but the build 
> failed when I ran the command above.
> I would appreciate any advice you could give me :)
>
>
> LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
>

-- 
You received this message because you are subscribed to the Google Groups 
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email 
to tesseract-ocr+unsubscr...@googlegroups.com.
To post to this group, send email to tesseract-ocr@googlegroups.com.
Visit this group at https://groups.google.com/group/tesseract-ocr.
To view this discussion on the web visit 
https://groups.google.com/d/msgid/tesseract-ocr/92bb29d0-5eba-49b4-928b-1b7daf1024b5%40googlegroups.com.
For more options, visit https://groups.google.com/d/optout.


Re: [tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2018-09-09 Thread Eugene Huang
Hello Periasamy!

Do you know what version of CentOS you're using? I used CentOS 7, and I
haven't tried this installation script on other versions of CentOS. I never
got the libtool error, so I'm sorry that I don't know what the exact
solution is. If you are using CentOS 7, perhaps there have been updates to
packages (like libtool) that make installation a little trickier.
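
For what it's worth, a libtool "Version mismatch" usually means the generated build files (aclocal.m4, ltmain.sh) came from a different libtool than the one now on the PATH, and the usual remedy is to regenerate them. A minimal sketch, assuming it is run from the top of the Tesseract source tree; the run and regenerate_build_system helpers and the DRY_RUN switch are hypothetical names of mine, and I have not reproduced this error myself:

```shell
# Hypothetical wrapper: with DRY_RUN=1 only print each command, otherwise run it.
run() {
    if [ "${DRY_RUN:-0}" = 1 ]; then
        echo "+ $*"
    else
        "$@"
    fi
}

# Regenerate aclocal.m4, ltmain.sh and configure with the libtool that is
# currently installed, then configure and build again.
# Any arguments are passed through to ./configure.
regenerate_build_system() {
    run autoreconf --force --install &&
    run ./configure "$@" &&
    run make
}

# Possible usage (not run here), from inside the tesseract source directory:
#   DRY_RUN=1 regenerate_build_system    # preview the commands
#   regenerate_build_system --with-extra-includes=/usr/local/include
```

If the mismatch persists after regenerating, it may be worth checking whether more than one libtool is installed (for example one under /usr and another under /usr/local).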

Good luck! If you discover the solution, feel free to post it here.
Eugene



On Thu, Sep 6, 2018 at 11:54 PM Periasamy Kanagavel <
periasamy.kanaga...@gmail.com> wrote:

> I am new to CentOS. I am trying the steps mentioned here. Up to the step
> "PKG_CONFIG_PATH=/usr/local/lib/pkgconfig ...", there were no issues.
> While running the command "LDFLAGS="-L/usr/local/lib"
> CFLAGS="-I/usr/local/include" make -j", I got the error "libtool: Version
> mismatch error.  This is libtool 2.4.6, but the". Am I missing anything?

[tesseract-ocr] Re: Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2018-04-25 Thread Eugene Huang
Hello Александр!

I took a look at your stuff; it is very extensive. If all the installations 
work, this should be front-paged! I have never used openSUSE. Could you 
point me to some resources on how to use your installation packages?


@shree
Thanks for the info. I definitely notice that Tesseract 4 is more accurate. 
For example, Tesseract 4 can read small italic fonts, whereas Tesseract 3 
makes lots of mistakes. Seems like Tesseract 4 is the future!



Re: [tesseract-ocr] Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2018-04-24 Thread Eugene Huang
@Shree
Thanks for the tip. Just 2 quick questions. 
1) From https://github.com/tesseract-ocr/tesseract/wiki/Data-Files, it says 
that "osd" and "equ" traineddata files are compatible between Tesseract 3 
and 4. In the GitHub tessdata_fast repo 
(https://github.com/tesseract-ocr/tessdata_fast), "osd" is there with the 
commit "Use legacy Orientation Script Detector (OSD) because that is the 
only thing that currently works." However, "equ" is not in the repo. Was 
this simply a small mistake where the maintainer forgot to include the 
"equ" data file?

2) Also, with tessdata_fast, Tesseract 4 runs faster for me than it does 
with tessdata. However, is Tesseract 4 expected to be slower than 
Tesseract 3? That is what I'm experiencing.




# Here are the updated instructions to download tessdata_fast, which in my
# tests does run faster than tessdata.
# However, when calling Tesseract from the command line, the argument
# "--oem 2" will no longer work.
# Use "--oem 1" instead, since tessdata_fast contains only the neural net
# LSTM model.
wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/osd.traineddata
wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/eng.traineddata
wget https://github.com/tesseract-ocr/tessdata_fast/raw/master/chi_sim.traineddata
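
As a rough illustration of the "--oem 1" point, here is a minimal sketch of a wrapper that always selects the LSTM engine. The function name ocr_lstm, the TESSERACT_BIN override, and the file names in the usage comment are placeholders of mine:

```shell
# Hypothetical wrapper: run Tesseract with the LSTM engine only (--oem 1).
#   $1 = input image, $2 = output base name, $3 = language (default: eng)
# TESSERACT_BIN lets you point at a specific binary; it defaults to "tesseract".
ocr_lstm() {
    "${TESSERACT_BIN:-tesseract}" "$1" "$2" --oem 1 -l "${3:-eng}"
}

# Possible usage (not run here); writes the recognized text to page.txt:
#   ocr_lstm page.png page
#   ocr_lstm page.png page chi_sim
```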


On Monday, April 23, 2018 at 2:37:09 PM UTC-4, shree wrote:
>
> Thanks for the script to install tesseract on CentOS.
>
> I would suggest using traineddata files from tessdata_fast or 
> tessdata_best repos for better accuracy and speed.

[tesseract-ocr] Install Tesseract 4 on CentOS and Red Hat [SOLVED!]

2018-04-23 Thread Eugene Huang
Hello! Most people are probably running Tesseract 4 on Ubuntu, macOS, and 
Windows. Unfortunately, there are no clear instructions for installing 
Tesseract 4 on other flavors of Linux, most notably CentOS and Red Hat.

After going through dependency hell, I successfully installed Tesseract 4 
onto CentOS 7. I presume that the installation script should also work for 
Red Hat. I want to give credit to EisenVault because this script is 
essentially a modified version of his script. This is my first contribution 
to open source software, so any tips will be highly appreciated!

When running this script line by line, you will probably have to prefix 
each line with "sudo". Alternatively, copy and paste it into a bash script 
and run that script with sudo. I have tested both approaches on a fresh 
image of CentOS 7 on VirtualBox.
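
To make the "run it as a script" option concrete, here is a minimal sketch of a preamble for such a script. The file name install-tesseract4.sh and the is_root helper are hypothetical; the idea is only to stop early if the script is not running as root, and to abort on the first failing command instead of carrying on:

```shell
# Hypothetical helper: succeed if the given UID (default: the current user's) is 0.
is_root() {
    [ "${1:-$(id -u)}" -eq 0 ]
}

# Possible preamble for install-tesseract4.sh (shown as comments so that
# pasting this snippet into an interactive shell does not close it):
#   set -euo pipefail
#   is_root || { echo "Please run with: sudo bash $0" >&2; exit 1; }
#   ...followed by the installation commands, unchanged...
```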

Cheers!

# (Estimated Time of Completion: 45 minutes)
# Instructions taken (and slightly modified) from
# https://github.com/EisenVault/install-tesseract-redhat-centos/blob/master/install-tesseract.sh
cd /opt
# The following line will take 30 minutes to install.
yum -y update
yum -y install libstdc++ autoconf automake libtool autoconf-archive pkg-config \
    gcc gcc-c++ make libjpeg-devel libpng-devel libtiff-devel zlib-devel
yum group install -y "Development Tools"


# Install Leptonica from Source
wget http://www.leptonica.com/source/leptonica-1.75.3.tar.gz
tar -zxvf leptonica-1.75.3.tar.gz
cd leptonica-1.75.3
./autobuild
./configure
make -j
make install
cd ..
# Delete tar.gz file if you like


# Sanity checks
# check if libpng is installed: type "whereis libpng" and expect to see a
# directory; a blank line is not good
# check if leptonica is installed: type "ls /usr/local/include" and expect
# to see "leptonica"
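
The Leptonica check can also be scripted. A minimal sketch, assuming the default /usr/local prefix used above; the helper name has_leptonica_headers is hypothetical:

```shell
# Hypothetical helper: succeed if <prefix>/include/leptonica contains
# allheaders.h, Leptonica's umbrella header. The prefix defaults to /usr/local.
has_leptonica_headers() {
    prefix="${1:-/usr/local}"
    [ -f "$prefix/include/leptonica/allheaders.h" ]
}

# Possible usage (not run here):
#   has_leptonica_headers || echo "Leptonica headers not found under /usr/local" >&2
#   PKG_CONFIG_PATH=/usr/local/lib/pkgconfig pkg-config --modversion lept
```

The pkg-config line in the usage comment is a second, independent check: it should print the installed Leptonica version if lept.pc was installed under /usr/local/lib/pkgconfig.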


# Install Tesseract from Source
wget https://github.com/tesseract-ocr/tesseract/archive/4.0.0-beta.1.tar.gz
tar -zxvf 4.0.0-beta.1.tar.gz
cd tesseract-4.0.0-beta.1/
./autogen.sh
PKG_CONFIG_PATH=/usr/local/lib/pkgconfig LIBLEPT_HEADERSDIR=/usr/local/include \
    ./configure --with-extra-includes=/usr/local/include --with-extra-libraries=/usr/local/lib
LDFLAGS="-L/usr/local/lib" CFLAGS="-I/usr/local/include" make -j
make install
ldconfig
cd ..
# Delete tar.gz file if you like
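
A note on "make -j": with no number, GNU make puts no limit on the number of parallel jobs, which can exhaust memory on a small VM. A minimal sketch of a bounded alternative; the helper name build_jobs is hypothetical, and falling back to 1 job when the CPU count cannot be determined is my assumption:

```shell
# Hypothetical helper: print the number of CPUs to use for "make -jN",
# falling back to 1 if neither nproc nor getconf can report it.
build_jobs() {
    n=$(nproc 2>/dev/null || getconf _NPROCESSORS_ONLN 2>/dev/null || echo 1)
    case "$n" in
        ''|*[!0-9]*) n=1 ;;   # not a plain number: be conservative
    esac
    echo "$n"
}

# Possible usage (not run here), in place of the bare "make -j" lines:
#   make -j"$(build_jobs)"
```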


# Download and install tesseract language files (Tesseract 4 traineddata files)
wget https://github.com/tesseract-ocr/tessdata/raw/master/osd.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/equ.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/eng.traineddata
wget https://github.com/tesseract-ocr/tessdata/raw/master/chi_sim.traineddata
# download any other languages you like
mv *.traineddata /usr/local/share/tessdata
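
The downloads can also be written as a loop over language codes. A minimal sketch using the same tessdata URLs as above; tessdata_url and TESSDATA_DIR are hypothetical names, and the mkdir -p is there only in case the target directory does not exist yet:

```shell
# Hypothetical helper: print the download URL for one language's traineddata file.
tessdata_url() {
    echo "https://github.com/tesseract-ocr/tessdata/raw/master/$1.traineddata"
}

# Possible usage (not run here; needs network access and root):
#   TESSDATA_DIR=/usr/local/share/tessdata
#   mkdir -p "$TESSDATA_DIR"
#   for lang in osd equ eng chi_sim; do
#       wget -O "$TESSDATA_DIR/$lang.traineddata" "$(tessdata_url "$lang")"
#   done
```

Adding another language is then a matter of appending its code to the list in the for loop.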


# Sanity check
# check if tesseract is installed: type "tesseract --version" and expect to
# see 1st line (tesseract), 2nd line (leptonica), 3rd line (libraries for
# images)
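
The version check can be automated too. A minimal sketch that reads the output of "tesseract --version" and prints the major version from the first line; tesseract_major_version is a hypothetical name, and it assumes the first line starts with "tesseract " followed by the version number, as described above:

```shell
# Hypothetical helper: read "tesseract --version" output on stdin and print
# the major version number taken from the first line (e.g. "tesseract 4.0.0...").
# Prints nothing if the first line does not have that shape.
tesseract_major_version() {
    sed -n '1s/^tesseract \([0-9][0-9]*\).*/\1/p'
}

# Possible usage (not run here); 2>&1 is there in case the version goes to stderr:
#   [ "$(tesseract --version 2>&1 | tesseract_major_version)" = 4 ] || echo "Tesseract 4 not found" >&2
#   tesseract --list-langs
```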
