Wondering if this issue was fixed in Tesseract 3.05.02. Any ideas?

On Friday, August 10, 2018 at 7:51:59 AM UTC-4, Mehul Bhardwaj wrote:
>
> Hi,
>
> I went through this discussion thread and updated to Tesseract 3.05.02.
> Previously I was working with version 3.05. I was getting the same error of
> "FAILURE: Couldn't find a matching blob" for about 15% of my training
> characters.
>
> But even after updating, I am still getting the exact same number of
> errors as before.
>
> Could there be any other reason for this?
>
> I have about 174 training images, which are nearly identical in
> brightness, sharpness, and background noise, and have identical character
> spacing and resolution.
> Out of 174 images, 48 images had no such error. 106 images had 5 or fewer
> such errors. Each image has, on average, 170 characters. So I am fairly
> certain that the image type or other factors such as character size,
> scaling, and spacing have nothing to do with it.
>
> Any recommended tests to identify the issue would be much appreciated.
>
> Best Regards
> Mehul
>
> On Tuesday, June 5, 2018 at 9:23:16 PM UTC+5:30, Paul Kitchen wrote:
>>
>> Thank you for your help with these issues. The 3.05 branch now has all
>> the issues fixed that I found.
>>
>> On Tuesday, June 5, 2018 at 8:59:08 AM UTC-6, zdenop wrote:
>>>
>>> Yes, it is ok, but you do not have to create a separate issue for a PR
>>> (a PR is an issue too).
>>>
>>> Zdenko
>>>
>>>
>>> On Tue, 5 Jun 2018 at 16:52, Paul Kitchen <paul.k...@hexagonmetrology.com>
>>> wrote:
>>>
>>>> ZDenko,
>>>>
>>>> I'm new to this so hopefully I did everything correctly. Here is the
>>>> issue I created:
>>>>
>>>> https://github.com/tesseract-ocr/tesseract/issues/1636
>>>>
>>>> And here is the pull request:
>>>>
>>>> https://github.com/tesseract-ocr/tesseract/pull/1637
>>>>
>>>> On Tuesday, June 5, 2018 at 7:23:41 AM UTC-6, zdenop wrote:
>>>>>
>>>>> You need to fork the official repository; then you have all the
>>>>> permissions you need. Once you have made your changes, you can send a
>>>>> pull request to the official repository.
>>>>>
>>>>> Zdenko
>>>>>
>>>>>
>>>>> On Tue, 5 Jun 2018 at 15:06, Paul Kitchen <paul.k...@hexagonmetrology.com>
>>>>> wrote:
>>>>>
>>>>>> ZDenko,
>>>>>>
>>>>>> Unfortunately I don't seem to have write permissions on the tesseract
>>>>>> repo so I am unable to create a branch off of master to make the
>>>>>> changes.
>>>>>> Who do I need to lobby to get write permission?
>>>>>>
>>>>>> On Tuesday, June 5, 2018 at 3:00:23 AM UTC-6, zdenop wrote:
>>>>>>>
>>>>>>> Please make a PR against the master (4.0) branch and I will
>>>>>>> cherry-pick it for 3.05...
>>>>>>>
>>>>>>> Zdenko
>>>>>>>
>>>>>>>
>>>>>>> On Tue, 5 Jun 2018 at 4:38, Paul Kitchen <paul.k...@hexagonmetrology.com>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> ZDenko,
>>>>>>>>
>>>>>>>> I checked out the latest tesseract code and updated to branch 3.05.
>>>>>>>> I see that the int64_t area bug is already fixed (thanks!). I also
>>>>>>>> see that the buffer read overrun is partially fixed. There is this
>>>>>>>> line in ReadAllBoxes():
>>>>>>>>
>>>>>>>> box_data.push_back('\0');
>>>>>>>>
>>>>>>>> Since the memory will have to be deleted and reallocated, this will
>>>>>>>> be quite inefficient. That is why I added this line to
>>>>>>>> LoadDataFromFile():
>>>>>>>>
>>>>>>>> data->reserve(size + 1);
>>>>>>>>
>>>>>>>> I'm willing to make the change in a feature branch then create the
>>>>>>>> pull request. I tried to create a branch in GitHub but apparently I
>>>>>>>> don't have branch creation privilege. I thought about forking but
>>>>>>>> I'm not familiar with how that works, or if it would even be
>>>>>>>> appropriate. Can you either make the change yourself or grant me
>>>>>>>> branch creation privilege in the repo so I can make the change in a
>>>>>>>> branch then create a pull request?
>>>>>>>>
>>>>>>>> By the way, I checked out the master branch and it also has the same
>>>>>>>> problem in LoadDataFromFile().
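
The reallocation concern in Paul Kitchen's June 5 message can be illustrated with a minimal sketch. It assumes a std::vector<char> in place of Tesseract's own container, reads from a string rather than a file, and uses the hypothetical names LoadWithSpareByte and AppendTerminatorInPlace, none of which exist in Tesseract. It only shows why reserving size + 1 up front may let the later push_back('\0') reuse the existing buffer instead of allocating a new one.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for LoadDataFromFile(): copies `src` into `data`
// and reserves one extra byte so a later terminator fits without growth.
// ASSUMPTION: std::vector<char> replaces Tesseract's own container type.
bool LoadWithSpareByte(const std::string& src, std::vector<char>* data) {
  assert(data != nullptr);
  data->clear();
  data->reserve(src.size() + 1);  // the "+ 1" leaves room for '\0'
  data->insert(data->end(), src.begin(), src.end());
  return true;
}

// Hypothetical stand-in for the push_back('\0') step in ReadAllBoxes().
// Returns true when the terminator was appended without the vector
// having to grow its storage (capacity unchanged).
bool AppendTerminatorInPlace(std::vector<char>* data) {
  assert(data != nullptr);
  const std::size_t capacity_before = data->capacity();
  data->push_back('\0');
  return data->capacity() == capacity_before;
}
```

With the reserve call in place, AppendTerminatorInPlace() should return true for a freshly loaded buffer. Without it, the standard library does not guarantee any spare capacity, so the push_back may have to allocate a larger buffer and copy the whole file contents across.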
-- You received this message because you are subscribed to the Google Groups "tesseract-ocr" group. To unsubscribe from this group and stop receiving emails from it, send an email to tesseract-ocr+unsubscr...@googlegroups.com. To post to this group, send email to tesseract-ocr@googlegroups.com. Visit this group at https://groups.google.com/group/tesseract-ocr. To view this discussion on the web visit https://groups.google.com/d/msgid/tesseract-ocr/ce6f6de7-aa19-4689-9138-8b5aa5749a6c%40googlegroups.com. For more options, visit https://groups.google.com/d/optout.