What is the correct way to change the training text of a traineddata that
I'm working on?
I'm training a new traineddata and it has started to give some results, but now
I want to change the text used to train it and continue from where I
stopped. How can I do it?
--
You received this message
And if I look at the "kor.unicharset" created after executing
"training/tesstrain.sh", it only contains the Korean characters, even after
I changed "kor.lstm-unicharset" from the "kor.traineddata".
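One way to confirm which characters actually ended up in a unicharset is to look them up directly. The following is a minimal sketch, assuming the usual plain-text unicharset layout (the first line is the entry count, and every later line starts with the character followed by a space). `unicharset_has_char` is a hypothetical helper name, not part of the Tesseract tools.

```shell
#!/bin/sh
# Hypothetical helper: succeed if CHAR appears as an entry in a unicharset file.
# Assumes the plain-text layout: line 1 is the entry count, and each later
# line starts with the character followed by a space and its properties.
unicharset_has_char() {
    file=$1
    char=$2
    # Skip the count on line 1, then compare the first field of each entry.
    awk -v c="$char" 'NR > 1 && $1 == c { found = 1 } END { exit !found }' "$file"
}
```

For example, `unicharset_has_char kor.unicharset 漢 && echo present` would print `present` only if that hanja is listed in the file.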
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr"
>
> You need to go through the complete training process after this. Only then
> will both sets of characters be reflected in it.
>
> You can try add-a-layer training, with tessdata_best/kor.traineddata to
> continue from.
>
> ShreeDevi
> __
I'm trying to add Chinese to my Korean charset, but I'm not able to do it.
Note: Since Korean can use some Chinese characters (hanja), I'm merging the
```kor.training_text``` with the ```chi_tra.training_text``` for tests.
Reference:
https://en.wikipedia.org/wiki/Hanja
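Merging the two training texts can be sketched as a plain concatenation. This is only an illustration under assumptions: `merge_training_text` is a hypothetical helper name and the output file name in the usage line is made up. Only the two input file names come from the message above.

```shell
#!/bin/sh
# Hypothetical helper: concatenate two training texts into one output file.
merge_training_text() {
    first=$1
    second=$2
    out=$3
    # "awk 1" prints every line of both inputs in order and always ends each
    # line with a newline, so the last line of the first file cannot run into
    # the first line of the second one.
    awk 1 "$first" "$second" > "$out"
}
```

Usage could look like `merge_training_text kor.training_text chi_tra.training_text kor_chi.training_text`, after which the complete training process still has to be run on the merged text.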
on
an image that has Korean and Chinese it is going to recognize some Korean
characters as Chinese and some Chinese characters as Korean.
On Monday, 9 April 2018 05:15:57 UTC-3, shree wrote:
>
> Leftover from 3.04, my guess.
>
> On Mon 9 Apr, 2018, 12:52 PM Fanatico, <fana
Thanks, I was going to do this; I just wanted to be sure there wasn't a way to
train 2 traineddata like the current one.
I want to train for kor+chi. How can I do it?
When I asked about passing the ".training_text" as a param, I meant during the
creation of the training data with "training/tesstrain.sh".
On Tuesday, 10 April 2018 13:30:05 UTC-3, Fanatico wrote:
>
> I just thought, but can I pass only the ".training_text" file
I see, thanks for the reply.
On Tuesday, 10 April 2018 11:45:59 UTC-3, Fanatico wrote:
>
> Platform: MAC OS X
> Tesseract: 4.0.0-beta.1-69-g10f4
>
> When I execute a command like:
>
> SCROLLVIEW_PATH=~/projects/tesseract/java \
> ~/projects/tessera
Try this command in the console:
brew info tesseract
This should return some information, including the path where your
tesseract is installed. Copy it and run this command in your console:
export TESSDATA_PREFIX=[the path you just copied]
try to execute your code again, if it works you can past
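Before exporting a path, it can help to check that it really contains the language file. The sketch below is an assumption-laden illustration: `find_traineddata` is a hypothetical helper, and it checks both the directory itself and a `tessdata` subdirectory because, as far as I know, which of the two TESSDATA_PREFIX should point at has differed between Tesseract versions.

```shell
#!/bin/sh
# Hypothetical helper: print where LANG.traineddata is found relative to a
# candidate TESSDATA_PREFIX, or fail if it is in neither expected place.
find_traineddata() {
    prefix=$1
    lang=$2
    if [ -f "$prefix/$lang.traineddata" ]; then
        # The prefix is the tessdata directory itself.
        echo "$prefix/$lang.traineddata"
    elif [ -f "$prefix/tessdata/$lang.traineddata" ]; then
        # The prefix is the parent of a tessdata directory.
        echo "$prefix/tessdata/$lang.traineddata"
    else
        echo "no $lang.traineddata under $prefix" >&2
        return 1
    fi
}
```

For example, `find_traineddata "$TESSDATA_PREFIX" kor` prints the matching file path, or an error on stderr if the variable points somewhere without `kor.traineddata`.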
Platform: MAC OS X
Tesseract: 4.0.0-beta.1-69-g10f4
When I execute a command like:
SCROLLVIEW_PATH=~/projects/tesseract/java \
~/projects/tesseract/training/lstmtraining \
--debug_interval 100 \
--continue_from ~/projects/ocr/training/kortrain/kor_from_full/kor.lstm \
--traineddata
Did you install it using brew or compile it yourself?
Try typing this in the terminal and post the result here:
echo $TESSDATA_PREFIX
The config for kor already had it:
#Fixes https://github.com/tesseract-ocr/tesseract/issues/1009
preserve_interword_spaces 1
9 Apr, 2018, 11:48 AM Fanatico, <fanati...@gmail.com >
> wrote:
>
>> I used a traineddata that I created by removing the top layer from the
>> kor.traineddata from "tessdata_best"; after this I replaced this
>> traineddata with the one from "tess
I used a traineddata that I created by removing the top layer from the
kor.traineddata from "tessdata_best"; after this I replaced this
traineddata with the one from "tessdata_best" and got the same problem.
Yes, it includes chi_tra as a sublanguage:
tessedit_load_sublangs chi_tra
I'm running tesseract with the "-l kor" param, but it is detecting some
Chinese characters. The image really does have 3 Chinese characters, and none of
them is returned correctly (I'm not expecting them to be), but other
Korean characters are also being recognized as Chinese
I just posted in the repo issues a step-by-step of what I needed to do to
use tesseract 4.0 on my Mac, so I'm sharing the link in case
someone has the same problems I had.
Note: It can save a few days of your life
https://github.com/tesseract-ocr/tesseract/issues/1453
I managed to get it working, but I needed to clone the repo and build it myself.
So I don't recommend installing tesseract using brew.
from the java folder "cd ~/projects/tesseract/java" in my case
On Saturday, 7 April 2018 12:40:29 UTC-3, shree wrote:
>
> Please see
> https://github.com/tesseract-ocr/tesseract/blob/master/Makefile.am
>
> From which dir did you try
>
> make ScrollView.jar
>
> ShreeDevi
>
Hi. I finally got the training from 4.0 to work, but I was unable to build
the ScrollView.jar, so I'm currently running the test with "--debug_interval
-1". Can someone help me?
System
Platform: MAC OS X 10.13.3 (installed with brew)
Tesseract: 4.0.0-beta.1
leptonica: 1.75.3
libjpeg 9c :
se look here:
https://github.com/tesseract-ocr/tesseract/issues/736
On Saturday, 7 April 2018 04:35:36 UTC-3, shree wrote:
>
> Look in your tmp directory in the sub folders referred in the console
> output
>
> Check the log file and other files there
>
> On Sat 7 Apr, 2018
Yes, the location is correct. I tried to put the full path to the folder
and got the same error.
I just cloned the https://github.com/tesseract-ocr/langdata repo
On Friday, 6 April 2018 23:28:06 UTC-3, shree wrote:
>
> Is your langdata in --langdata_dir ../../langdata
>
I'm trying to execute the training from the 4.0 tutorial, but I'm getting
an error. Can someone help with this?
Platform: MAC OS X 10.13.3
Tesseract: 4.0.0-beta.1
leptonica: 1.75.3
libjpeg 9c : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11
Code used
../../tesseract/training/tesstrain.sh \
Thanks for the quick response; I did not see this part in the documentation
...
My problem is that in the image "kor.AppleMyungjo.exp0.tif" tesseract
recognizes nothing and the box file is empty, and in the image
"kor.AppleMyungjo.exp1.tif" it does not recognize the last quotation marks
Hi, I'm new to tesseract and ocr in general, and need some help to train my
tesseract.
Config
Platform: Mac OS X 10.13.3
Tesseract Version: 4.0.0-beta.1
leptonica: 1.75.3
libjpeg 9c : libpng 1.6.34 : libtiff 4.0.9 : zlib 1.2.11
images used
kor.AppleMyungjo.exp1.tif