Wondering if this issue was fixed in Tesseract 3.05.02. Any ideas?

On Friday, August 10, 2018 at 7:51:59 AM UTC-4, Mehul Bhardwaj wrote:
>
> Hi,
>
> I went through this discussion thread and updated to Tesseract 3.05.02. 
> Previously I was working with version 3.05. I was getting the same error of 
> "FAILURE: Couldn't find a matching blob" for about 15% of my training 
> characters. 
>
> But even after updating, I am still getting the exact same number of 
> errors as before.
>
> Could there be any other reason for this?
>
> I have about 174 training images, which are nearly identical in terms of 
> brightness, sharpness, and background noise, and have identical character 
> spacing and resolution.
> Out of 174 images, 48 images had no such error. 106 images had 5 or fewer 
> such errors. Each image has, on average, 170 characters. So I am fairly 
> certain that the image type or other factors such as character size, 
> scaling, and spacing have nothing to do with it.
>
> Any recommended tests to identify the issue would be greatly appreciated.
>
> Best Regards
> Mehul
>
> On Tuesday, June 5, 2018 at 9:23:16 PM UTC+5:30, Paul Kitchen wrote:
>>
>> Thank you for your help with these issues. The 3.05 branch now has all 
>> the issues fixed that I found.
>>
>> On Tuesday, June 5, 2018 at 8:59:08 AM UTC-6, zdenop wrote:
>>>
>>> Yes, it is OK, but you do not have to create a separate issue for a PR 
>>> (a PR is an issue too).
>>>
>>> Zdenko
>>>
>>>
>>> On Tue, Jun 5, 2018 at 16:52, Paul Kitchen <paul.k...@hexagonmetrology.com> 
>>> wrote:
>>>
>>>> Zdenko,
>>>>
>>>> I'm new to this so hopefully I did everything correctly. Here is the 
>>>> issue I created:
>>>>
>>>> https://github.com/tesseract-ocr/tesseract/issues/1636
>>>>
>>>> And here is the pull request:
>>>>
>>>> https://github.com/tesseract-ocr/tesseract/pull/1637
>>>>
>>>> On Tuesday, June 5, 2018 at 7:23:41 AM UTC-6, zdenop wrote:
>>>>>
>>>>> You need to fork the official repository; in your fork you have all the 
>>>>> permissions you need. Once you have made your changes, you can send a 
>>>>> pull request with them to the official repository.
>>>>>
>>>>> Zdenko
>>>>>
>>>>>
>>>>> On Tue, Jun 5, 2018 at 15:06, Paul Kitchen <paul.k...@hexagonmetrology.com> 
>>>>> wrote:
>>>>>
>>>>>> Zdenko,
>>>>>>
>>>>>> Unfortunately I don't seem to have write permissions on the tesseract 
>>>>>> repo so I am unable to create a branch off of master to make the 
>>>>>> changes. Who do I need to lobby to get write permission?
>>>>>>
>>>>>> On Tuesday, June 5, 2018 at 3:00:23 AM UTC-6, zdenop wrote:
>>>>>>>
>>>>>>> Please make a PR for the master (4.0) branch and I will cherry-pick it 
>>>>>>> for 3.05...
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>>
>>>>>>> On Tue, Jun 5, 2018 at 4:38, Paul Kitchen <paul.k...@hexagonmetrology.com> 
>>>>>>> wrote:
>>>>>>>
>>>>>>>> Zdenko,
>>>>>>>>
>>>>>>>> I checked out the latest tesseract code and switched to the 3.05 branch. 
>>>>>>>> I see that the int64_t area bug is already fixed (thanks!). I also see 
>>>>>>>> that the buffer read overrun is partially fixed. There is this line 
>>>>>>>> in ReadAllBoxes():
>>>>>>>>
>>>>>>>> box_data.push_back('\0');
>>>>>>>>
>>>>>>>> Since the buffer has no spare capacity at that point, its memory will 
>>>>>>>> have to be freed and reallocated, which is quite inefficient. That is 
>>>>>>>> why I added this line to LoadDataFromFile():
>>>>>>>>
>>>>>>>> data->reserve(size + 1);
>>>>>>>>
>>>>>>>> I'm willing to make the change in a feature branch and then create the 
>>>>>>>> pull request. I tried to create a branch on GitHub, but apparently I 
>>>>>>>> don't have branch creation privileges. I thought about forking, but I'm 
>>>>>>>> not familiar with how that works, or whether it would even be 
>>>>>>>> appropriate. Can you either make the change yourself or grant me branch 
>>>>>>>> creation privileges in the repo so that I can make the change in a 
>>>>>>>> branch and then create a pull request?
>>>>>>>>
>>>>>>>> By the way, I checked out master branch and it also has the same 
>>>>>>>> problem in LoadDataFromFile().
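
The reserve(size + 1) idea above can be illustrated with a minimal sketch. It is not Tesseract's code: std::vector<char> stands in for whatever container the library uses, and the names LoadDataSketch and TerminateInPlace are hypothetical helpers invented for this illustration. The sketch only shows that reserving one spare byte at load time lets a later push_back('\0') run without changing the buffer's capacity.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for a file loader (not Tesseract's LoadDataFromFile):
// copies `contents` into `data`, reserving one spare byte up front so that a
// terminator can be appended later without growing the buffer.
void LoadDataSketch(const std::string& contents, std::vector<char>* data) {
  const std::size_t size = contents.size();
  data->reserve(size + 1);  // the extra byte is the point of the change
  data->assign(contents.begin(), contents.end());
}

// Hypothetical helper: appends '\0' and reports whether the capacity stayed
// the same, i.e. whether the append happened without a reallocation.
bool TerminateInPlace(std::vector<char>* data) {
  const std::size_t capacity_before = data->capacity();
  data->push_back('\0');
  return data->capacity() == capacity_before;
}
```

Calling LoadDataSketch on a box-file-like string and then TerminateInPlace on the result should return true, because the vector already has room for the terminator. Without the reserve call, whether the append reallocates depends on how the container sized the buffer when it was filled.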
>>>>>>>>
>>>>>>>> -- 
>>>>>>>> You received this message because you are subscribed to the Google 
>>>>>>>> Groups "tesseract-ocr" group.
>>>>>>>> To unsubscribe from this group and stop receiving emails from it, 
>>>>>>>> send an email to tesseract-oc...@googlegroups.com.
>>>>>>>> To post to this group, send email to tesser...@googlegroups.com.
>>>>>>>> Visit this group at https://groups.google.com/group/tesseract-ocr.
>>>>>>>> To view this discussion on the web visit 
>>>>>>>> https://groups.google.com/d/msgid/tesseract-ocr/307a7e38-bb5d-4870-ac12-29c735c3c9f8%40googlegroups.com
>>>>>>>>  
>>>>>>>> <https://groups.google.com/d/msgid/tesseract-ocr/307a7e38-bb5d-4870-ac12-29c735c3c9f8%40googlegroups.com?utm_medium=email&utm_source=footer>
>>>>>>>> .
>>>>>>>> For more options, visit https://groups.google.com/d/optout.
>>>>>>>>
