Hello,

I'd like to know whether Mahout offers a way to convert a binary file, such as the IDX files of the MNIST corpus (http://yann.lecun.com/exdb/mnist/), into files containing Mahout vectors. I'd like to use these vectors for classification with RBMs, which I am currently implementing.

I'd also like to ask what the best way would be to split such a corpus into smaller batches and consume them in Hadoop's map/reduce during the training phase. I am pretty new to Hadoop.
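
To make the conversion part of the question concrete, here is a rough sketch of the reading side only, using plain Java and no Mahout classes. It assumes the IDX image layout described on the MNIST page (a big-endian magic number 0x00000803, then the image count, row count and column count as 32-bit integers, then one unsigned byte per pixel). The class name IdxReader and the scaling of pixels to [0, 1] are illustrative choices, not something Mahout provides.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;

// Hypothetical helper: parses an IDX image file into one double[] per image.
public class IdxReader {

    public static double[][] readImages(InputStream in) throws IOException {
        // DataInputStream reads big-endian integers, which matches the IDX header.
        DataInputStream data = new DataInputStream(in);

        int magic = data.readInt();
        if (magic != 0x00000803) {
            throw new IOException("Not an IDX image file, magic=" + magic);
        }
        int count = data.readInt();
        int rows = data.readInt();
        int cols = data.readInt();

        double[][] images = new double[count][rows * cols];
        byte[] buffer = new byte[rows * cols];
        for (int i = 0; i < count; i++) {
            data.readFully(buffer);
            for (int j = 0; j < buffer.length; j++) {
                // Pixels are unsigned bytes; scaling to [0, 1] is an assumption
                // made here because RBM inputs are often kept in that range.
                images[i][j] = (buffer[j] & 0xFF) / 255.0;
            }
        }
        return images;
    }
}
```

Each returned row could then presumably be wrapped in a Mahout DenseVector and written out as a VectorWritable to a Hadoop SequenceFile, but that step is left out here because it is exactly the part I am unsure about.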

Thanks for the help.

Regards,
Dirk