--Push--
Does anyone have an idea?
Thanks for your help!
On Sunday, 8 September 2019 at 12:23:28 UTC+2, test0r man wrote:
>
> Hi,
> I use this command:
>
> tesseract input/image.jpg output/output --dpi 72 --oem 1 -l deu+eng
>
> to scan images like "1_input.jpg" and "2_input.jpg". The OCR result is
I didn’t try these images, but my first guess: can you try again without
the --dpi 72 option?
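
A minimal sketch of that comparison, assuming the tesseract binary is on PATH and reusing the paths and flags from the quoted command. The helper names (build_cmd, run_ocr) and the two output base names are made up for illustration; the only difference between the two runs is whether --dpi 72 is passed.

```python
import subprocess


def build_cmd(image, outputbase, dpi=None, oem=1, langs="deu+eng"):
    """Build the tesseract argument list; --dpi is added only when dpi is given."""
    cmd = ["tesseract", image, outputbase]
    if dpi is not None:
        cmd += ["--dpi", str(dpi)]
    cmd += ["--oem", str(oem), "-l", langs]
    return cmd


def run_ocr(cmd):
    """Run tesseract (assumes the binary is installed and on PATH)."""
    subprocess.run(cmd, check=True)


# Two variants to compare: the original command, and the same one without --dpi.
# The output base names are hypothetical.
with_dpi = build_cmd("input/image.jpg", "output/with_dpi", dpi=72)
without_dpi = build_cmd("input/image.jpg", "output/without_dpi")

if __name__ == "__main__":
    for cmd in (with_dpi, without_dpi):
        print(" ".join(cmd))  # swap print for run_ocr(cmd) to actually run the OCR
```

Writing both results to separate output files makes it easy to diff them afterwards.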
> On Oct 5, 2019, at 4:04 AM, test0r man wrote:
>
> --Push--
>
> Does anyone have an idea?
>
> Thanks for your help!
>
>
> On Sunday, 8 September 2019 at 12:23:28 UTC+2, test0r wrote
https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy
Gimp is your friend.
From: tesseract-ocr@googlegroups.com [mailto:tesseract-ocr@googlegroups.com] On
Behalf Of Ravi Annaswamy
Sent: 05 October 2019 11:08
To: tesseract-ocr@googlegroups.com
Subject:
tesseract 2_input_cropped.png - --psm 6 --oem 0
6.
7.
8.
9.
10.
Zdenko
On Sat, 5 Oct 2019 at 10:04, test0r man wrote:
> --Push--
>
> Does anyone have an idea?
>
> Thanks for your help!
>
>
> On Sunday, 8 September 2019 at 12:23:28 UTC+2, test0r man wrote:
>>
>> Hi,
>> I use this command:
>>
>>
It seems this bash script (legacy.sh) handles the mapping of non-Unicode
fonts, acting as a legacy-to-Unicode converter. It also seems to be
responsible for generating the box, tif, and lstmf files. Am I right? So
where should I place this script file in
I've tried without the --dpi 72 option. The result on the first image is a
bit worse; on the second image there is no change.
On Saturday, 5 October 2019 at 12:08:35 UTC+2, Ravi Annaswamy wrote:
>
> I didn’t try these images, but my first guess: can you try again without
> the --dpi 72 option?
>
> Sent from my
Thanks for your test. I added the border with ImageMagick to get a better
result on the first image. With psm 6, tesseract detects all numbers
correctly, but only on the second image. Have you tried the first image too?
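
For reference, a rough sketch of the border step, assuming ImageMagick's convert command and its -bordercolor and -border options. The border width, the color, the output file name, and the helper names are assumptions; the message does not say which values were used.

```python
import subprocess


def border_cmd(src, dst, border_px=10, color="white"):
    """Build an ImageMagick command that pads the image with a plain border."""
    size = f"{border_px}x{border_px}"
    # "convert" is the ImageMagick 6 entry point; ImageMagick 7 uses "magick".
    return ["convert", src, "-bordercolor", color, "-border", size, dst]


def add_border(src, dst, border_px=10, color="white"):
    """Apply the border (assumes ImageMagick is installed and on PATH)."""
    subprocess.run(border_cmd(src, dst, border_px, color), check=True)


# Hypothetical values: a 10 px white border around the first test image.
cmd = border_cmd("1_input.jpg", "1_input_border.png")

if __name__ == "__main__":
    print(" ".join(cmd))  # call add_border(...) instead to write the padded image
```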
On Saturday, 5 October 2019 at 14:52:15 UTC+2, zdenop wrote:
>
>
> tesseract
If you use Linux, you can try something similar to the attached bash script.
On Thu, Oct 3, 2019 at 2:55 PM Shree Devi Kumar
wrote:
> There is no direct method for training from non-unicode fonts. Tesseract's
> output is also Unicode text only.
>
> You can work from scanned images of text in non-unicode
"end" is a typo ;-) It should be read as "eng" :-)
On Sat, 5 Oct 2019 at 21:31, test0r man wrote:
> Hi Zdenko,
>
> Very good job! I tried so many image manipulations, but that was the wrong
> approach for problems 1-3. The idea with the uzn file is great, and I
> think it is the perfect solution. Thanks :-)
Thanks for the link. I will read it and try it out.
On Saturday, 5 October 2019 at 14:38:26 UTC+2, Testing Windows Screenshots
wrote:
>
>
> https://stackoverflow.com/questions/9480013/image-processing-to-improve-tesseract-ocr-accuracy
>
>
>
> Gimp is your friend.
>
>
>
> *From:*
Hi Akos,
It depends on which period of Fraktur you want to OCR. For material from
before 1750, you cannot expect very good results.
This one is from around 1770, in Fraktur (similar to Breitkopffraktur), and
the result is not so bad:
Hi Zdenko,
Very good job! I tried so many image manipulations, but that was the wrong
approach for problems 1-3. The idea with the uzn file is great, and I
think it is the perfect solution. Thanks :-)
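
For readers unfamiliar with uzn files, a rough sketch follows. It assumes the commonly described layout: a plain-text file next to the image with the same base name and a .uzn extension, one zone per line as "left top width height label", which tesseract is generally reported to pick up when run with a page segmentation mode such as --psm 4. The zone coordinates and helper names below are made up for illustration and are not taken from the thread.

```python
from pathlib import Path


def uzn_lines(zones):
    """Format (left, top, width, height, label) tuples as uzn lines."""
    return [
        f"{left} {top} {width} {height} {label}"
        for (left, top, width, height, label) in zones
    ]


def write_uzn(image_path, zones):
    """Write the zones to <image base name>.uzn next to the image."""
    uzn_path = Path(image_path).with_suffix(".uzn")
    uzn_path.write_text("\n".join(uzn_lines(zones)) + "\n")
    return uzn_path


# Hypothetical zones in pixels, measured from the top-left corner of the image.
zones = [
    (10, 20, 200, 40, "Text"),
    (10, 80, 200, 40, "Text"),
]

if __name__ == "__main__":
    print("\n".join(uzn_lines(zones)))
```

With the file in place, a call along the lines of "tesseract 1_input.jpg - --psm 4" should restrict recognition to the listed zones, under the assumptions above.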
I can confirm that scaling these images didn't help (more than 30 pixels
per letter is the right