> Hm, in Norwegian it isn't that rare. Or at least shouldn't be ;). Æ is
> the uppercase version of æ, and it would never occur in the middle of a
> word.
>
> I find it strange that it has been left out altogether. What must I do to
> get it in there?
>
Tuesday, 3 January 2017
p.s. If you post some example images, I'm happy to knock together a quick
example for you.
It looks like the native file format is AVI and AVI files have the ability
to incorporate streams of not only video and audio, but also closed
captioning info and other metadata. Is it safe to assume
On Monday, January 2, 2017 at 12:49:38 AM UTC-5, jean-charles compagnon
wrote:
>
> I have attached the captcha that I cannot decode.
>
It says YNXAJB.
Or do you mean your computer program can't decode it? In that case, it
sounds like it is working as intended.
First, the latest version is 3.04 (although there's also a tag for 3.05).
Second, there will soon (hopefully) be a release for 4.00 which will make
3.x obsolete.
Having said that, it looks like the root cause of your problem is that
Tesseract doesn't know Æ is a possible letter for Norwegian.
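One way to check this yourself is to look at the unicharset inside the Norwegian traineddata. Below is a minimal sketch, assuming you have already unpacked it with something like `combine_tessdata -u nor.traineddata nor.` and read the resulting `nor.unicharset` as UTF-8 text. The helper name `unicharset_symbols` and the sample string are made up for illustration (the sample is heavily simplified, not a real unicharset).

```python
def unicharset_symbols(text):
    """Return the set of symbols listed in the contents of a unicharset file.

    Assumes the usual layout: the first line is the entry count, and every
    following line starts with the symbol, followed by its properties.
    """
    symbols = set()
    for line in text.splitlines()[1:]:
        fields = line.split(" ")
        if fields and fields[0]:
            symbols.add(fields[0])
    return symbols


# Simplified, made-up sample: lowercase æ is present, uppercase Æ is not.
sample = "3\nNULL 0\næ 3\nA 5\n"
symbols = unicharset_symbols(sample)
has_lower = "æ" in symbols
has_upper = "Æ" in symbols
```

If `has_upper` comes out false for the real file, the engine has no way to output that letter, whatever the image looks like.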
I'm still hoping to learn how to use GetComponentImages / SetRectangle
better, but I found a workaround to get what I need out of GetIterator /
iterate_level... BoundingBoxInternal is not something I can find
documentation for, but I saw a reference to it
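For anyone following along, here is a rough sketch of the iterator approach in tesserocr. It uses the public `BoundingBox` call rather than `BoundingBoxInternal`, and steps with `Next` by hand instead of `iterate_level`, so that the loop only depends on the iterator object. The function names `collect_boxes` and `words_with_boxes` are my own, and the `RuntimeError` handling is an assumption about how `GetUTF8Text` behaves when an element has no text.

```python
def collect_boxes(iterator, level):
    """Walk a result iterator and return (text, (left, top, right, bottom)) pairs."""
    results = []
    if iterator is None:  # no recognition results at all
        return results
    while True:
        box = iterator.BoundingBox(level)
        if box is not None:
            try:
                text = iterator.GetUTF8Text(level)
            except RuntimeError:
                # Assumption: raised when the element has no text.
                text = ""
            results.append((text, box))
        if not iterator.Next(level):
            break
    return results


def words_with_boxes(image_path):
    """Run recognition on one image and return word-level text with boxes."""
    # Imported here so collect_boxes can be tried without tesserocr installed.
    from tesserocr import PyTessBaseAPI, RIL

    with PyTessBaseAPI() as api:
        api.SetImageFile(image_path)
        api.Recognize()
        return collect_boxes(api.GetIterator(), RIL.WORD)
```

Swapping `RIL.WORD` for `RIL.SYMBOL` or `RIL.TEXTLINE` changes the granularity of the boxes.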
Greetings and salutations fellow OCR'ers ;).
I have been playing around with various PowerShell modules for reading
text from an image, but I have landed on using tesseract directly. It all
works fine, and it reads like a dream :). However, it
seems it is having problems with
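For reference, calling the tesseract executable directly looks roughly like this when scripted. This is a sketch in Python rather than PowerShell, assuming `tesseract` is on the PATH and the Norwegian language data (`nor`) is installed. The helper names and the file name in the usage note are placeholders.

```python
import subprocess


def build_tesseract_command(image_path, lang="nor"):
    """Build the argument list for a direct tesseract call.

    Passing "stdout" as the output base makes tesseract print the recognized
    text instead of writing it to a .txt file.
    """
    return ["tesseract", image_path, "stdout", "-l", lang]


def run_ocr(image_path, lang="nor"):
    """Run tesseract on one image and return the recognized text."""
    result = subprocess.run(
        build_tesseract_command(image_path, lang),
        capture_output=True,
        encoding="utf-8",  # decode explicitly so letters like Æ survive
        check=True,
    )
    return result.stdout
```

Something like `print(run_ocr("scan.png"))` would then print whatever tesseract recognized.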
I've continued to spend a little time each day working on my problem. I've
found something that fuels my desire to understand what GetComponentImages
does differently from iterate_level.
from PIL import Image
Image.MAX_IMAGE_PIXELS=10
from tesserocr import PyTessBaseAPI, RIL
image =
The whole point of a captcha is to evade automated reading. That's why the
letters are packed closely together and rotated heavily off a consistent
baseline. OCR is designed for ordinary text input, so you need to do some
clever preprocessing here first.