Hi,

thanks for your reply.

1. I tested about 100 samples with sklearn. My example showed only one
sample for the sake of readability and simplicity.

In short: I read the image with OpenCV, then detect a region of interest
and extract the digits through contouring. These are machine-written
digits, but they can have slight perspective distortion. The digits are
around 20x25 px in a 72 dpi JPEG, so they are far from optimal for OCR, but
tesseract nonetheless does a good job. The main reasons for failure are
perspective distortion (which can be de-warped in code, but I hope not to
go there) and discontinuities in the digit strokes that sometimes emerge as
I clean the image with morphological operators. I have roughly 90% success
with tesseract and less than 50% with the sklearn digits dataset, bearing
in mind that I further degrade the image to fit the 6x8 shape that the
trained sklearn algorithm expects. The sklearn errors are mostly small, in
the sense that I always get, for example, "5" when I feed the predictor an
image of the digit "6", and similar. When sklearn fails outright, I get "1"
as the result, and this is not so common.
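
The downscaling step above can be sketched as follows. This is only a
minimal sketch under stated assumptions, not the code I actually ran: the
images bundled with sklearn's load_digits are 8x8 with integer values from
0 to 16, produced by counting "on" pixels in 4x4 blocks of a 32x32 bitmap,
so the sketch imitates that encoding with plain NumPy. The helper name
to_digits_format, the ink_is_dark flag, the midpoint threshold, the SVC
parameters and the synthetic crop are all made up for illustration. Whether
this improves accuracy on real crops is untested.

```python
import numpy as np
from sklearn import datasets, svm


def to_digits_format(crop, ink_is_dark=True):
    """Convert a 2-D grayscale digit crop to an 8x8 array with values 0..16.

    Hypothetical helper: imitates the load_digits encoding (ink pixel counts
    per 4x4 block of a 32x32 bitmap).
    """
    crop = np.asarray(crop, dtype=float)
    # Crude binarization at the midpoint between the darkest and brightest
    # pixel (an assumption; a proper threshold may work better). True = ink.
    threshold = (crop.min() + crop.max()) / 2.0
    ink = crop < threshold if ink_is_dark else crop > threshold

    # Pad to a square so the digit keeps its aspect ratio.
    h, w = ink.shape
    side = max(h, w)
    square = np.zeros((side, side), dtype=bool)
    top, left = (side - h) // 2, (side - w) // 2
    square[top:top + h, left:left + w] = ink

    # Nearest-neighbour resize to 32x32 using index arithmetic.
    idx = np.arange(32) * side // 32
    bitmap = square[np.ix_(idx, idx)]

    # Count ink pixels in each non-overlapping 4x4 block.
    return bitmap.reshape(8, 4, 8, 4).sum(axis=(1, 3))


# Train on the bundled digits (parameter values are illustrative).
digits = datasets.load_digits()
clf = svm.SVC(gamma=0.001, C=100.0)
clf.fit(digits.data, digits.target)

# Synthetic 25x20 crop: a dark vertical bar on a white background.
crop = np.full((25, 20), 255, dtype=np.uint8)
crop[3:22, 8:12] = 0

features = to_digits_format(crop).reshape(1, -1)
prediction = clf.predict(features)[0]
print(prediction)
```

The point of the block counting is that the predictor then sees values in
the same range and polarity (ink is high, background is zero) as the data
it was trained on.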

2. No, I just used the code from the sklearn documentation without further
tweaking, as machine learning algorithms and most of their concepts are
foreign to me. I know the very basics, but I don't know, for example, what
hyperparameters are; I will investigate later today.
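
For reference, hyperparameters are settings chosen before training rather
than learned from the data; for an SVC with an RBF kernel the usual ones
are C and gamma. A minimal sketch of tuning them with a cross-validated
grid search on the bundled digits dataset follows. The candidate values,
the split and cv=3 are arbitrary choices for illustration, and the import
path assumes a recent scikit-learn (older releases had GridSearchCV in
sklearn.grid_search).

```python
from sklearn import datasets, svm
from sklearn.model_selection import GridSearchCV, train_test_split

digits = datasets.load_digits()
X_train, X_test, y_train, y_test = train_test_split(
    digits.data, digits.target, test_size=0.25, random_state=0)

# Candidate values are illustrative, not recommendations.
param_grid = {"C": [1, 10, 100], "gamma": [0.0001, 0.001, 0.01]}

# Try every C/gamma combination with 3-fold cross-validation on the
# training split, then refit the best combination on all of it.
search = GridSearchCV(svm.SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X_train, y_train)

best_params = search.best_params_
test_accuracy = search.score(X_test, y_test)  # accuracy on held-out data
print(best_params, test_accuracy)
```

The held-out score only says how well the tuned model does on data that
resembles the training set; it says nothing yet about crops from my own
images.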

3. Thanks for the link, it looks very promising (even the digit dimensions
used). I'll follow the example and report back.


Regards,
klo



On Sat, May 24, 2014 at 5:23 AM, Caleb wrote:

> Hi,
>
> I am curious about a few things:
>
> 1. What samples do you use to test your classifier? A single sample is
> hardly enough to judge its accuracy.
>
> 2. Did you try to fine-tune the hyperparameters of your SVM?
>
> 3. You might be interested in this blog post; the author gets very
> impressive results:
> http://peekaboo-vision.blogspot.de/2010/09/mnist-for-ever.html
>
> regards,
> Caleb
>
>
_______________________________________________
Scikit-learn-general mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general