This is a classic machine learning problem that several different
algorithms can handle.  Logistic regression is the obvious choice, and
clustering algorithms such as k-means can also work if you assign each
cluster the majority label of its training points.  Just flatten each
image's pixels into one long vector and train your algorithm on the
input-output pairs.  You can get high accuracy on this fairly easily
(though not a literal 100% on held-out data) if you are careful with the
bias-variance trade-off.  This is a fun one for neural networks too!
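
To make the "long vector" idea concrete, here is a rough sketch in plain
Python (not Mahout, and not a tuned implementation).  The 3x3 "images",
the two stroke classes, and every function name are made up for
illustration; real digit data would have ten classes and far more pixels.

```python
import math

def flatten(image):
    """Turn a 2-D grid of pixel values into one long feature vector."""
    return [float(p) for row in image for p in row]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logistic(vectors, labels, epochs=500, lr=0.5):
    """Batch gradient descent on the log loss; returns (weights, bias)."""
    n, m = len(vectors[0]), len(vectors)
    w, b = [0.0] * n, 0.0
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * n, 0.0
        for x, y in zip(vectors, labels):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, x)) + b) - y
            for i, xi in enumerate(x):
                grad_w[i] += err * xi
            grad_b += err
        w = [wi - lr * g / m for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / m
    return w, b

def predict(w, b, x):
    """Return 1 if the modelled probability is at least 0.5, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if sigmoid(score) >= 0.5 else 0

# Hypothetical toy data: label 1 = vertical stroke, label 0 = horizontal.
VERTICAL_LEFT = [[1, 0, 0], [1, 0, 0], [1, 0, 0]]
VERTICAL_MID = [[0, 1, 0], [0, 1, 0], [0, 1, 0]]
HORIZONTAL_TOP = [[1, 1, 1], [0, 0, 0], [0, 0, 0]]
HORIZONTAL_MID = [[0, 0, 0], [1, 1, 1], [0, 0, 0]]

images = [VERTICAL_LEFT, VERTICAL_MID, HORIZONTAL_TOP, HORIZONTAL_MID]
labels = [1, 1, 0, 0]

vectors = [flatten(img) for img in images]
w, b = train_logistic(vectors, labels)
predictions = [predict(w, b, x) for x in vectors]
```

This only checks the fit on the training examples.  With four toy
images there is nothing to hold out, so it says nothing about
generalization, which is where the bias-variance question comes in.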

Essentially any machine learning book will cover this in greater detail,
as the US Postal Service handwritten digit data has been around for a
long time.  I think Kaggle even had this as a training exercise for a
while, so there's probably a ton of discussion of various methods and
algorithms on their message boards.

For kicks, why don't you compare k-means clustering to logistic regression
using Mahout?
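
If you try the k-means side of that comparison, one way to turn clusters
into a classifier is to give each centroid the majority label of the
training points nearest to it.  A minimal sketch in plain Python follows;
it is not Mahout's API, and the four-pixel vectors, the labels, and all
function names are invented for illustration.

```python
from collections import Counter

def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def nearest_index(centroids, x):
    return min(range(len(centroids)), key=lambda c: sq_dist(x, centroids[c]))

def kmeans(points, k, iterations=20):
    """Lloyd's algorithm, seeded with the first k points (deterministic)."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest_index(centroids, p)].append(p)
        for c, members in enumerate(clusters):
            if members:  # keep the old centroid if a cluster is empty
                centroids[c] = [sum(col) / len(members)
                                for col in zip(*members)]
    return centroids

def label_centroids(centroids, points, labels):
    """Assign each centroid the majority label of its nearest points."""
    votes = [Counter() for _ in centroids]
    for p, y in zip(points, labels):
        votes[nearest_index(centroids, p)][y] += 1
    return [v.most_common(1)[0][0] if v else None for v in votes]

def classify(centroids, centroid_labels, x):
    return centroid_labels[nearest_index(centroids, x)]

# Hypothetical flattened "images" with known labels.
points = [
    [0.9, 0.1, 0.9, 0.1],
    [0.1, 0.9, 0.1, 0.9],
    [1.0, 0.0, 0.8, 0.2],
    [0.0, 1.0, 0.2, 0.8],
]
labels = [0, 1, 0, 1]

centroids = kmeans(points, k=2)
centroid_labels = label_centroids(centroids, points, labels)
guess_a = classify(centroids, centroid_labels, [0.8, 0.2, 0.7, 0.3])
guess_b = classify(centroids, centroid_labels, [0.2, 0.7, 0.3, 0.9])
```

On real digits you would likely want more clusters than classes, since
one digit can be written in several distinct styles, and you would score
both methods on the same held-out set.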

-Angus




On Thu, Jan 23, 2014 at 8:00 PM, Chameera Wijebandara <
[email protected]> wrote:

> Hi,
>
> I am trying to classify handwritten digits using mahout classification. Any
> suggestion to come up with good solution?
>
> --
> Thanks,
>     Chameera
>
