I tried learning some OpenCV and building the mask like this:
# One box per card column: (x0, y0, x1, y1)
boxes = [
    (45, 0, 245, im.height),
    (320, 0, 515, im.height),
    (600, 0, 785, im.height),
]
# Wider images have a fourth column
if im.width > 1000:
    boxes.append(
        (865, 0, 1065, im.height)
    )
mask = np.zeros(data.shape[:2], np.uint8)
for box in boxes:
    cv2.rectangle(mask, (box[0], box[1]), (box[2], box[3]), 255, -1)
# Horizontal bands that contain the text
mask2 = np.zeros(data.shape[:2], np.uint8)
boxes = [
    (0, 58, im.width, 110),
    (0, 312, im.width, 360)
]
for box in boxes:
    cv2.rectangle(mask2, (box[0], box[1]), (box[2], box[3]), 255, -1)
# Keep only the areas where a column box and a text band overlap
mask = cv2.bitwise_and(mask, mask2)
image_final = cv2.bitwise_and(data, data, mask=mask)
image_final = cv2.threshold(cv2.cvtColor(image_final, cv2.COLOR_BGR2GRAY),
                            0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
# Flood-fill the black background with white, starting from the top-left corner
mask1 = np.zeros((image_final.shape[0] + 2, image_final.shape[1] + 2),
                 np.uint8)
cv2.floodFill(image_final, mask1, (0, 0), 255)
The results aren't that good, and I don't know if this is a good way to make
a mask.
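
For comparison, here is a minimal numpy-only sketch of the "fixed regions" idea suggested earlier in the thread: build the region mask once per image size and reuse it. It is only an illustration, not a tested replacement for the OpenCV code. The box coordinates are copied from the code above; the function names build_region_mask and apply_region_mask, the constants, and the synthetic 400x836 test image are assumptions made up for the example. It also paints everything outside the mask white directly instead of using floodFill, which is a different technique from the one above.

```python
import numpy as np

# Box coordinates copied from the OpenCV code above; everything else is illustrative.
COLUMN_BOXES = [(45, 245), (320, 515), (600, 785)]   # (x0, x1) per card column
TEXT_BANDS = [(58, 110), (312, 360)]                 # (y0, y1) per text row


def build_region_mask(height, width, column_boxes, text_bands):
    """Return a uint8 mask that is 255 where a column box and a text band overlap."""
    cols = np.zeros((height, width), np.uint8)
    for x0, x1 in column_boxes:
        cols[:, x0:x1 + 1] = 255      # +1 because cv2.rectangle includes both corners
    rows = np.zeros((height, width), np.uint8)
    for y0, y1 in text_bands:
        rows[y0:y1 + 1, :] = 255
    return cols & rows


def apply_region_mask(image, mask):
    """Keep pixels inside the mask and paint everything else white."""
    return np.where(mask[..., None] == 255, image, np.uint8(255))


# Synthetic stand-in for a real screenshot: every pixel has the text colour.
image = np.full((400, 836, 3), 51, np.uint8)
mask = build_region_mask(400, 836, COLUMN_BOXES, TEXT_BANDS)  # build once, reuse
result = apply_region_mask(image, mask)
print(result[60, 50], result[0, 0])
```

Because the mask depends only on the image size, it could be built once and kept around for every image of that size, so only the np.where step would run per image.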
On Monday, January 3, 2022 at 5:07:00 PM UTC-8 Cyrus Yip wrote:
> for this image
> [image: drop12.png]
> it still fails to get the text from the bottom right
> cards:
> ['MasumiMushishiZokuShou', 'TamaoHino*Eyeshield21',
> "DiegoBrando~'sBizarreAdi:tocolBalRan", '']
>
> On Monday, January 3, 2022 at 10:50:42 AM UTC-8 zdenop wrote:
>
>> Increase the kernel height passed to getStructuringElement from 4 to 5
>> when creating the mask:
>>
>> kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 5))
>>
>>
>> Zdenko
>>
>>
>> On Mon, 3 Jan 2022 at 0:08, Cyrus Yip <[email protected]> wrote:
>>
>>> Ok, I will look into how to do that. But do you have an idea why some of
>>> the letters go missing?
>>>
>>> On Sunday, January 2, 2022 at 1:10:45 PM UTC-8 zdenop wrote:
>>>
>>>> All images you presented have the same size and the text is always in
>>>> the same regions.
>>>> So you can create a mask for these regions and apply it to the
>>>> thresholded input images. This could give you extra speed as you do not
>>>> need to create a mask for each image individually...
>>>>
>>>> Zdenko
>>>>
>>>>
>>>> On Sun, 2 Jan 2022 at 21:01, Cyrus Yip <[email protected]> wrote:
>>>>
>>>>> I tried the opencv version, but it fails with images like this:
>>>>> [image: drop12.png][image: hi.png]
>>>>>
>>>>> On Saturday, January 1, 2022 at 12:29:34 PM UTC-8 zdenop wrote:
>>>>>
>>>>>> And here is opencv2 version with IMO better quality:
>>>>>>
>>>>>>
>>>>>> import cv2
>>>>>> import numpy as np
>>>>>> from PIL import Image
>>>>>>
>>>>>> data = cv2.imread("mina.png")
>>>>>> # Select pixels that exactly match the text colour (51, 51, 51)
>>>>>> mask_text = cv2.inRange(data, (51, 51, 51), (51, 51, 51))
>>>>>>
>>>>>> # Morph open to remove noise
>>>>>> kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
>>>>>> morph = cv2.morphologyEx(mask_text, cv2.MORPH_OPEN, kernel,
>>>>>>                          iterations=1)
>>>>>>
>>>>>> # Dilate so the matched pixels merge into text regions
>>>>>> kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (9, 4))
>>>>>> dilate = cv2.dilate(morph, kernel, iterations=4)
>>>>>>
>>>>>> thresh = cv2.threshold(cv2.cvtColor(data, cv2.COLOR_BGR2GRAY),
>>>>>>                        0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)[1]
>>>>>> image_final = cv2.bitwise_and(thresh, thresh, mask=dilate)
>>>>>> # replace background with white
>>>>>> mask1 = np.zeros((image_final.shape[0] + 2, image_final.shape[1] + 2),
>>>>>>                  np.uint8)
>>>>>> cv2.floodFill(image_final, mask1, (0, 0), 255)
>>>>>>
>>>>>> # display() is available in Jupyter/IPython
>>>>>> display(Image.fromarray(image_final))
>>>>>>
>>>>>>
>>>>>> [image: image.png]
>>>>>>
>>>>>>
>>>>>> Zdenko
>>>>>>
>>>>>>
>>>>>> On Sat, 1 Jan 2022 at 20:40, Zdenko Podobny <[email protected]> wrote:
>>>>>>
>>>>>>> What is your code? Does it work on your local computer?
>>>>>>>
>>>>>>> BTW: here is proven numpy code:
>>>>>>>
>>>>>>> filter_colors = [(51, 51, 51), (69, 69, 65), (65, 64, 60), (59, 58,
>>>>>>> 56), (67, 66, 62),
>>>>>>> (67, 67, 63), (67, 67, 62), (53, 53, 53), (54, 54, 53),
>>>>>>> (61, 61, 58),
>>>>>>> (62, 62, 60), (55, 55, 54), (59, 59, 57), (56, 56, 55)]
>>>>>>>
>>>>>>> image = np.array(Image.open('mina.png').convert("RGB"))
>>>>>>>
>>>>>>> *A, B = image.shape
>>>>>>> mask = (image.reshape((-1,B)) ==
>>>>>>> np.array(filter_colors)[:,None]).all(-1).any(0).reshape(A)
>>>>>>> img = Image.fromarray(~mask)
>>>>>>>
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>>
>>>>>>> On Sat, 1 Jan 2022 at 19:49, Cyrus Yip <[email protected]> wrote:
>>>>>>>
>>>>>>>> I managed to install Tesseract 5, but the numpy mask doesn't work
>>>>>>>> now.
>>>>>>>> It makes pictures like:
>>>>>>>> [image: image.png]
>>>>>>>> not:
>>>>>>>> [image: image.png]
>>>>>>>>
>>>>>>>>
>>>>>>>> Dockerfile:
>>>>>>>> # syntax=docker/dockerfile:1
>>>>>>>> ARG TOKEN
>>>>>>>> FROM ubuntu:18.04
>>>>>>>> RUN apt-get update
>>>>>>>> RUN apt-get install -y software-properties-common
>>>>>>>> RUN apt-get install -y python3.8
>>>>>>>> RUN apt-get install -y python3-pip
>>>>>>>> RUN apt-get update
>>>>>>>> RUN apt-get install -y build-essential
>>>>>>>> RUN apt-get install -y python3-pil
>>>>>>>> COPY requirements.txt requirements.txt
>>>>>>>> RUN pip3 install -r requirements.txt
>>>>>>>> RUN apt-get update
>>>>>>>> RUN add-apt-repository ppa:alex-p/tesseract-ocr5
>>>>>>>> RUN apt-get update
>>>>>>>> RUN apt-get install -y tesseract-ocr
>>>>>>>> COPY . .
>>>>>>>> CMD ["python3", "bot.py"]
>>>>>>>>
>>>>>>>> On Friday, December 31, 2021 at 10:29:59 AM UTC-8 Cyrus Yip wrote:
>>>>>>>>
>>>>>>>>> better link?
>>>>>>>>> <https://www.toptal.com/developers/hastebin/nonepalihe>
>>>>>>>>>
>>>>>>>>> On Friday, December 31, 2021 at 10:27:41 AM UTC-8 Cyrus Yip wrote:
>>>>>>>>>
>>>>>>>>>> Right now I'm installing tesseract 4 in docker with
>>>>>>>>>> RUN apt-get install -y tesseract-ocr
>>>>>>>>>> That might be a reason why it's way slower than on my computer,
>>>>>>>>>> how can I install tesseract 5?
>>>>>>>>>>
>>>>>>>>>> Dockerfile:
>>>>>>>>>>
>>>>>>>>>> # syntax=docker/dockerfile:1
>>>>>>>>>>
>>>>>>>>>> ARG TOKEN
>>>>>>>>>>
>>>>>>>>>> FROM python:3.8-slim-buster
>>>>>>>>>>
>>>>>>>>>> RUN apt-get update
>>>>>>>>>> RUN apt-get install -y software-properties-common
>>>>>>>>>> RUN apt-get update
>>>>>>>>>> RUN add-apt-repository ppa:alex-p/tesseract-ocr-devel
>>>>>>>>>>
>>>>>>>>>> RUN apt-get update
>>>>>>>>>> RUN apt-get install -y build-essential
>>>>>>>>>>
>>>>>>>>>> COPY requirements.txt requirements.txt
>>>>>>>>>> RUN pip3 install -r requirements.txt
>>>>>>>>>>
>>>>>>>>>> COPY . .
>>>>>>>>>>
>>>>>>>>>> RUN apt-get install -y tesseract
>>>>>>>>>>
>>>>>>>>>> CMD ["python3", "bot.py"]
>>>>>>>>>>
>>>>>>>>>> Build logs
>>>>>>>>>> <https://appbuild-logs-ams3.ams3.digitaloceanspaces.com/a7609af2-64e1-4ba2-8555-87a4fac8a37f/9420eaef-131e-410f-8add-bbfb870b2693/981a4c35-45d7-41b5-8619-3d9125d60c25/build.log?X-Amz-Algorithm=AWS4-HMAC-SHA256&X-Amz-Credential=2JPIHVK4OTM6S5VRFBCK%2F20211231%2Fams3%2Fs3%2Faws4_request&X-Amz-Date=20211231T182608Z&X-Amz-Expires=900&X-Amz-SignedHeaders=host&X-Amz-Signature=3ae248ce9fb9e6fef0c71955d9cd9496feb8311162bdda8921750a21544f79a6>
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> On Friday, December 31, 2021 at 3:18:18 AM UTC-8 zdenop wrote:
>>>>>>>>>>
>>>>>>>>>>> You are right - np.isin is working another way than I expected
>>>>>>>>>>> (it does not match tuples, but individual values at tuples) and by
>>>>>>>>>>> coincidence, it produces similar results as your code.
>>>>>>>>>>>
>>>>>>>>>>> Here is updated code that produces the same result as PIL. It is
>>>>>>>>>>> faster than the PIL loop, but it will get slower as the number of
>>>>>>>>>>> colors in filter_colors grows.
>>>>>>>>>>>
>>>>>>>>>>> filter_colors = [(51, 51, 51), (69, 69, 65), (65, 64, 60), (59,
>>>>>>>>>>> 58, 56), (67, 66, 62),
>>>>>>>>>>> (67, 67, 63), (67, 67, 62), (53, 53, 53), (54, 54,
>>>>>>>>>>> 53), (61, 61, 58),
>>>>>>>>>>> (62, 62, 60), (55, 55, 54), (59, 59, 57), (56, 56, 55)]
>>>>>>>>>>>
>>>>>>>>>>> image = np.array(Image.open('mai.png').convert("RGB"))
>>>>>>>>>>> mask = np.array([], dtype=bool)
>>>>>>>>>>> for color in filter_colors:
>>>>>>>>>>>     if mask.size == 0:
>>>>>>>>>>>         mask = (image == color).all(-1)
>>>>>>>>>>>     else:
>>>>>>>>>>>         mask = mask | (image == color).all(-1)
>>>>>>>>>>> img = Image.fromarray(~mask)
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> Zdenko
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>> On Fri, 31 Dec 2021 at 1:45, Cyrus Yip <[email protected]> wrote:
>>>>>>>>>>>
>>>>>>>>>>>> For some reason, using the numpy array has a different result
>>>>>>>>>>>> than mine.
>>>>>>>>>>>>
>>>>>>>>>>>> Numpy array:
>>>>>>>>>>>>
>>>>>>>>>>>> [image: hi.png]
>>>>>>>>>>>> Loop through pixels:
>>>>>>>>>>>> [image: hi.png]
>>>>>>>>>>>> The second way is more accurate but way slower.
>>>>>>>>>>>> On Thursday, December 30, 2021 at 11:43:01 AM UTC-8 zdenop
>>>>>>>>>>>> wrote:
>>>>>>>>>>>>
>>>>>>>>>>>>> try this:
>>>>>>>>>>>>>
>>>>>>>>>>>>> import numpy as np
>>>>>>>>>>>>> from PIL import Image
>>>>>>>>>>>>>
>>>>>>>>>>>>> filter_colors = [(51, 51, 51), (69, 69, 65), (65, 64, 60),
>>>>>>>>>>>>> (59, 58, 56), (67, 66, 62),
>>>>>>>>>>>>>
>>>>>>>>>>>>> (67, 67, 63), (67, 67, 62), (53, 53, 53), (54, 54,
>>>>>>>>>>>>> 53), (61, 61, 58),
>>>>>>>>>>>>> (62, 62, 60), (55, 55, 54), (59, 59, 57), (56, 56,
>>>>>>>>>>>>> 55)]
>>>>>>>>>>>>> image = np.array(Image.open('mai.png').convert("RGB"))
>>>>>>>>>>>>> mask = np.isin(image, filter_colors, invert=True)
>>>>>>>>>>>>> img = Image.fromarray(mask.any(axis=2))
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>
>>>>>>>>>>>>>
>>>>>>>>>>>>> On Thu, 30 Dec 2021 at 18:14, Cyrus Yip <[email protected]> wrote:
>>>>>>>>>>>>>
>>>>>>>>>>>>>> I also tried many things like cropping, colour changing,
>>>>>>>>>>>>>> colour replacing, and mixing them together.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> I landed on checking if a pixel is not one of these:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> [(51, 51, 51), (69, 69, 65), (65, 64, 60), (59, 58, 56), (67,
>>>>>>>>>>>>>> 66, 62), (67, 67, 63), (67, 67, 62), (53, 53, 53), (54, 54, 53),
>>>>>>>>>>>>>> (61, 61,
>>>>>>>>>>>>>> 58), (62, 62, 60), (55, 55, 54), (59, 59, 57), (56, 56, 55)]
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> colours, replace it with white. It is pretty accurate but is
>>>>>>>>>>>>>> there a way to do this with numpy arrays?
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> (code)
>>>>>>>>>>>>>> for y in range(im.height):
>>>>>>>>>>>>>>     for x in range(im.width):
>>>>>>>>>>>>>>         if pixels[x, y] not in [
>>>>>>>>>>>>>>                 (51, 51, 51), (69, 69, 65), (65, 64, 60),
>>>>>>>>>>>>>>                 (59, 58, 56), (67, 66, 62), (67, 67, 63),
>>>>>>>>>>>>>>                 (67, 67, 62), (53, 53, 53), (54, 54, 53),
>>>>>>>>>>>>>>                 (61, 61, 58), (62, 62, 60), (55, 55, 54),
>>>>>>>>>>>>>>                 (59, 59, 57), (56, 56, 55)]:
>>>>>>>>>>>>>>             pixels[x, y] = (255, 255, 255)
>>>>>>>>>>>>>> On Thursday, December 30, 2021 at 8:46:51 AM UTC-8 zdenop
>>>>>>>>>>>>>> wrote:
>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> OK. I played a little bit ;-):
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I tested the speed of your code with your image:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> import timeit
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> pil_color_replace = """
>>>>>>>>>>>>>>> from PIL import Image
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> im = Image.open('mai.png').convert("RGB")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> pixdata = im.load()
>>>>>>>>>>>>>>> for y in range(im.height):
>>>>>>>>>>>>>>>     for x in range(im.width):
>>>>>>>>>>>>>>>         if pixdata[x, y] != (51, 51, 51):
>>>>>>>>>>>>>>>             pixdata[x, y] = (255, 255, 255)
>>>>>>>>>>>>>>> """
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> elapsed_time = timeit.timeit(pil_color_replace,
>>>>>>>>>>>>>>> number=100)/100
>>>>>>>>>>>>>>> print(f"duration: {elapsed_time:.4} seconds")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I got an average time of 0.08547 seconds on my computer.
>>>>>>>>>>>>>>> On the internet I found a suggestion to use numpy for this, and
>>>>>>>>>>>>>>> I ended up with the following code:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> np_color_replace_rgb = """
>>>>>>>>>>>>>>> import numpy as np
>>>>>>>>>>>>>>> from PIL import Image
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> data = np.array(Image.open('mai.png').convert("RGB"))
>>>>>>>>>>>>>>> mask = (data == [51, 51, 51]).all(-1)
>>>>>>>>>>>>>>> img = Image.fromarray(np.invert(mask))
>>>>>>>>>>>>>>> """
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> elapsed_time = timeit.timeit(np_color_replace_rgb,
>>>>>>>>>>>>>>> number=100)/100
>>>>>>>>>>>>>>> print(f"duration: {elapsed_time:.4} seconds")
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> I got an average time of 0.01774 seconds, i.e. about 4.8 times
>>>>>>>>>>>>>>> faster than the PIL code.
>>>>>>>>>>>>>>> It is cheating a little, as it does not replace colors: it just
>>>>>>>>>>>>>>> takes a mask of the target color and returns it as a binarized
>>>>>>>>>>>>>>> image, which is exactly what you need for OCR ;-)
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Also, I would like to point out that the resulting OCR output
>>>>>>>>>>>>>>> is not perfect (compared to OCR of unmodified text areas), as
>>>>>>>>>>>>>>> this kind of binarization is very simple.
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>> On Thu, 30 Dec 2021 at 11:19, Zdenko Podobny <[email protected]> wrote:
>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Just made your tests ;-)
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> You can use tesserocr (installation may be quite difficult
>>>>>>>>>>>>>>>> if you are on Windows) instead of pytesseract, i.e. initialize
>>>>>>>>>>>>>>>> the Tesseract API once and use it multiple times. But it does
>>>>>>>>>>>>>>>> not provide DICT output.
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>> On Wed, 29 Dec 2021 at 21:18, Cyrus Yip <[email protected]> wrote:
>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> but won't multiple ocr's and crops use a lot of time?
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>> On Wednesday, December 29, 2021 at 10:15:26 AM UTC-8
>>>>>>>>>>>>>>>>> zdenop wrote:
>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> IMO if the text is always in the same area, cropping and
>>>>>>>>>>>>>>>>>> OCR just that area will be faster.
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>> On Wed, 29 Dec 2021 at 18:58, Cyrus Yip <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> I played around a bit with replacing all colours except
>>>>>>>>>>>>>>>>>>> for the text colour, and it works pretty well!
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>> The only thing is replacing colours with:
>>>>>>>>>>>>>>>>>>> im = im.convert("RGB")
>>>>>>>>>>>>>>>>>>> pixdata = im.load()
>>>>>>>>>>>>>>>>>>> for y in range(im.height):
>>>>>>>>>>>>>>>>>>>     for x in range(im.width):
>>>>>>>>>>>>>>>>>>>         if pixdata[x, y] != (51, 51, 51):
>>>>>>>>>>>>>>>>>>>             pixdata[x, y] = (255, 255, 255)
>>>>>>>>>>>>>>>>>>> is a bit slow. Do you know a better way to replace
>>>>>>>>>>>>>>>>>>> pixels in python? I don't know if this is off topic.
>>>>>>>>>>>>>>>>>>> On Wednesday, December 29, 2021 at 9:46:13 AM UTC-8
>>>>>>>>>>>>>>>>>>> zdenop wrote:
>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> If you properly crop text areas you get good output.
>>>>>>>>>>>>>>>>>>>> E.g.
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> [image: r_cropped.png]
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> > tesseract r_cropped.png - --dpi 300
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Rascal Does Not Dream
>>>>>>>>>>>>>>>>>>>> of Bunny Girl Senpai
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> Zdenko
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>> On Wed, 29 Dec 2021 at 18:21, Cyrus Yip <[email protected]> wrote:
>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Here is an example of an image I would like to use OCR
>>>>>>>>>>>>>>>>>>>>> on:
>>>>>>>>>>>>>>>>>>>>> [image: drop8.png]
>>>>>>>>>>>>>>>>>>>>> I would like the results to be like:
>>>>>>>>>>>>>>>>>>>>> ["Naruto Uzumaki Naruto", "Mai Sakurajima Rascal Does
>>>>>>>>>>>>>>>>>>>>> Not Dream of Bunny Girl Senpai", "Keqing Genshin Impact"]
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Right now I'm using
>>>>>>>>>>>>>>>>>>>>> region1 = im.crop((0, 55, im.width, 110))
>>>>>>>>>>>>>>>>>>>>> region2 = im.crop((0, 312, im.width, 360))
>>>>>>>>>>>>>>>>>>>>> image = Image.new("RGB", (im.width, region1.height +
>>>>>>>>>>>>>>>>>>>>> region2.height + 20))
>>>>>>>>>>>>>>>>>>>>> image.paste(region1)
>>>>>>>>>>>>>>>>>>>>> image.paste(region2, (0, region1.height + 20))
>>>>>>>>>>>>>>>>>>>>> results = pytesseract.image_to_data(image,
>>>>>>>>>>>>>>>>>>>>> output_type=pytesseract.Output.DICT)
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> The processed image looks like
>>>>>>>>>>>>>>>>>>>>> [image: hi.png]
>>>>>>>>>>>>>>>>>>>>> but I am getting results like:
>>>>>>>>>>>>>>>>>>>>> [' ',
>>>>>>>>>>>>>>>>>>>>> '»MaiSakurajima¥RascalDoesNotDreamofBunnyGirlSenpai',
>>>>>>>>>>>>>>>>>>>>> 'iGenshinImpact']
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> How do I optimize the image/configs so the ocr is more
>>>>>>>>>>>>>>>>>>>>> accurate?
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> Thank you.
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> --
>>>>>>>>>>>>>>>>>>>>> You received this message because you are subscribed
>>>>>>>>>>>>>>>>>>>>> to the Google Groups "tesseract-ocr" group.
>>>>>>>>>>>>>>>>>>>>> To unsubscribe from this group and stop receiving
>>>>>>>>>>>>>>>>>>>>> emails from it, send an email to
>>>>>>>>>>>>>>>>>>>>> [email protected].
>>>>>>>>>>>>>>>>>>>>> To view this discussion on the web visit
>>>>>>>>>>>>>>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/1a2fa0e4-b998-4931-ad7d-ae069a46568bn%40googlegroups.com
>>>>>>>>>>>>>>>>>>>>>
>>>>>>>>>>>>>>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/1a2fa0e4-b998-4931-ad7d-ae069a46568bn%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>>>>>>>>>>>>>>> .
>>>>>>>>>>>>>>>>>>>>>
--
You received this message because you are subscribed to the Google Groups
"tesseract-ocr" group.
To unsubscribe from this group and stop receiving emails from it, send an email
to [email protected].
To view this discussion on the web visit
https://groups.google.com/d/msgid/tesseract-ocr/1013d21f-395b-47b8-a20f-f88bfd8aab2dn%40googlegroups.com.