Your issue is probably related to this
one: https://github.com/tesseract-ocr/tesseract/issues/792
Are you on Windows or Ubuntu?
You might try upgrading Tesseract to version 5. I am not well versed
in Tesseract, so my knowledge is very limited.
On Thursday, November 23, 2023 at
The tif files are not corrupted and the box files are not of size zero.
On Thursday, November 23, 2023 at 12:51:49 PM UTC+5:30 desal...@gmail.com
wrote:
> Make sure that the tif files are not corrupted and that the box files are
> not zero size.
>
> Des
>
> On 23 Nov 2023 at 9:26:39 AM, Adepu Sai
Make sure that the tif files are not corrupted and that the box files are
not zero size.
Des
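A minimal sketch of that check in Python, under a few assumptions: the ground-truth files sit under one directory, the images use the .tif extension, and looking at the first four bytes of each image is enough to catch gross corruption. The function name find_suspect_files is hypothetical, and a header check does not prove that an image decodes correctly.

```python
from pathlib import Path

# Classic TIFF files start with one of these 4-byte headers
# (little-endian "II*\0" or big-endian "MM\0*"). BigTIFF uses a
# different version number and would be flagged by this simple check.
TIFF_MAGIC = (b"II*\x00", b"MM\x00*")


def find_suspect_files(ground_truth_dir):
    """Return (empty_box_files, suspect_tif_files) found under the directory."""
    root = Path(ground_truth_dir)

    # Box files of size zero.
    empty_boxes = sorted(p for p in root.rglob("*.box") if p.stat().st_size == 0)

    # Image files whose header does not look like a classic TIFF header.
    suspect_tifs = []
    for tif in sorted(root.rglob("*.tif")):
        with tif.open("rb") as handle:
            header = handle.read(4)
        if header not in TIFF_MAGIC:
            suspect_tifs.append(tif)

    return empty_boxes, suspect_tifs
```

Calling find_suspect_files on the ground-truth directory (for tesstrain with MODEL_NAME=Y145 that would presumably be data/Y145-ground-truth) returns two lists of paths to inspect by hand.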
On 23 Nov 2023 at 9:26:39 AM, Adepu Sai Rahul
wrote:
>
> chinnu@SaiRahul2507:~/tesseract_tutorial/tesstrain$
> TESSDATA_PREFIX=../tesseract/tessdata make training MODEL_NAME=Y145
> START_MODEL=eng
chinnu@SaiRahul2507:~/tesseract_tutorial/tesstrain$
TESSDATA_PREFIX=../tesseract/tessdata make training MODEL_NAME=Y145
START_MODEL=eng TESSDATA=../tesseract/tessdata MAX_ITERATIONS=200
You are using make version: 4.3
lstmtraining \
--debug_interval 0 \
--traineddata
The character error rate (CER) is the most common measure of the quality of
your training.
- Train with a large amount of data and run it for a couple of epochs, so
that your CER gets as close to 0.01 as possible. That is the most common
strategy.
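For reference, a small sketch of how CER is usually defined: the edit distance between the recognized text and the ground truth, divided by the length of the ground truth. This is the textbook definition, not necessarily the exact computation lstmtraining does internally, and the function names are hypothetical.

```python
def edit_distance(reference, hypothesis):
    """Levenshtein distance between two strings (insert, delete, substitute)."""
    previous = list(range(len(hypothesis) + 1))
    for i, ref_char in enumerate(reference, start=1):
        current = [i]
        for j, hyp_char in enumerate(hypothesis, start=1):
            cost = 0 if ref_char == hyp_char else 1
            current.append(min(
                previous[j] + 1,         # deletion
                current[j - 1] + 1,      # insertion
                previous[j - 1] + cost,  # substitution or match
            ))
        previous = current
    return previous[-1]


def character_error_rate(reference, hypothesis):
    """CER as a fraction: edit distance divided by the reference length."""
    if not reference:
        raise ValueError("reference text must not be empty")
    return edit_distance(reference, hypothesis) / len(reference)
```

With this definition, a CER of 0.01 means roughly one wrong character per hundred characters of ground truth.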
On Wednesday, November 22, 2023 at 4:50:45 PM UTC+3 smon...@gmail.com wrote:
> As
Most people seem to watch the character error. That is supposed to be the
most important indicator of accuracy. I think a character error of less than
1% is what most people aim for.
On Wednesday, November 22, 2023 at 4:50:45 PM UTC+3 smon...@gmail.com wrote:
> As I am training my model I got
From my limited experience, you need a lot more data than that to train
from scratch. If you can't produce more data than that, you might first try
to fine-tune, and then train by removing the top layer of the best model.
On Wednesday, November 22, 2023 at 4:46:53 PM UTC+3 smon...@gmail.com
While training my model I came across the following metrics:
E.g.:
At iteration 6345/6500/6500, Mean rms=6.246%, delta=7.139%, char
train=68.07%, word train=92.2%, skip ratio=0%, New best char error = 68.07
wrote checkpoint.
Unfortunately I don't find any proper and detailed
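To track these numbers over a run, a log line in the format quoted above can be parsed with a regular expression. This is only a sketch: the pattern is written against the single example line in this message, the function name and dictionary keys are hypothetical, and other Tesseract versions may label the fields differently. The three iteration counters are kept as a plain tuple without interpreting them.

```python
import re

# Pattern modelled on the example line quoted in this thread.
LOG_PATTERN = re.compile(
    r"At iteration (?P<it1>\d+)/(?P<it2>\d+)/(?P<it3>\d+), "
    r"Mean rms=(?P<rms>[\d.]+)%, delta=(?P<delta>[\d.]+)%, "
    r"char train=(?P<char_train>[\d.]+)%, "
    r"word train=(?P<word_train>[\d.]+)%, "
    r"skip ratio=(?P<skip_ratio>[\d.]+)%"
)


def parse_training_line(text):
    """Return the metrics of one log line as a dict, or None if it does not match."""
    # Mail clients wrap long lines, so collapse all whitespace first.
    line = " ".join(text.split())
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    return {
        "iterations": tuple(int(match.group(k)) for k in ("it1", "it2", "it3")),
        "mean_rms": float(match.group("rms")),
        "delta": float(match.group("delta")),
        "char_train": float(match.group("char_train")),
        "word_train": float(match.group("word_train")),
        "skip_ratio": float(match.group("skip_ratio")),
    }
```

Feeding every line of the training output through parse_training_line and keeping the non-None results gives a series of char_train values that can be plotted or compared between runs.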
As it is not really possible to combine my traineddata trained from scratch
with an existing one, I have decided to also train my traineddata model on
numbers.
Therefore I wrote a script which synthetically generates groundtruth data
with text2image.
This script uses dozens of different fonts and