On Mon, Feb 4, 2013 at 8:47 PM, VIGNESH S <[email protected]> wrote:
> I am trying to classify mails
>

And how much training data do you have? How many labeled example emails
which are _known_ to be of one class or another do you expect to have
available for training? If this number is not terribly large, then training
with Mahout may be non-optimal, and you should just go with WEKA, Mallet,
LibSVM, etc. Most classifiers (from any library) will produce a small model
file at the end of the day which could potentially be used on a mobile device.

Alternatively, having your mobile app talk to a remote web service would
allow you to trade local memory for remote method calls (presumably the
email hits a webserver before it gets to the cell phone, so doing
classification there may be a better way to go).

>
> On Tue, Feb 5, 2013 at 12:10 AM, Jake Mannix <[email protected]> wrote:
> > On Mon, Feb 4, 2013 at 12:53 AM, VIGNESH S <[email protected]> wrote:
> >
> >> Hi Jake,
> >>
> >> Thanks for your comments..
> >>
> >> What is understood from your comment is Incase of Training,we can use
> >> hadoop in clusters to generate the trained model blob file..
> >>
> >
> > Using a classifier has two steps: training the classifier (on data which
> > has already been pre-labeled), and using the trained classifier on
> > previously unlabeled items to predict the label which that item should
> > have.
> >
> > Some Mahout algorithms can be trained on a Hadoop cluster, but in general
> > are intended to be trained using large quantities of input data. The
> > resultant serialized classifier file may be small, and this is what is used
> > in the second step, and could conceivably fit on a mobile device.
> >
> >
> >> We can use that trained model blob file in mobile device for
> >> classification..
> >>
> >> is it possible to generate trained model blob file for all algorithms?
> >>
> >
> > Not all algorithms even have simple single model files, no.
> >
> > What exactly are you trying to do on a mobile device?
> >
> >>
> >> On Sat, Feb 2, 2013 at 1:53 AM, Jake Mannix <[email protected]> wrote:
> >> > On Fri, Feb 1, 2013 at 7:19 AM, Chris Harrington <[email protected]> wrote:
> >> >
> >> >> Kind of off topic but why Mahout and not Weka and why on a mobile
> >> >> device.
> >> >>
> >> >> Mahout is built to be scalable for large datasets, not something
> >> >> you'd associate with a mobile device.
> >> >
> >> > Mahout scalability is about the *training set*. For example, you run a
> >> > webmail service, you have tons and tons of spam and not-spam emails. You
> >> > use Mahout to train a classifier on Hadoop using this training data, at
> >> > the end of the day, you spit out a sparse classifier model file, which
> >> > could reasonably be a *very small* blob, under 100-1000KB.
> >> >
> >> >> On any mobile device you're going to run into memory issues very quickly
> >> >> with any sizable dataset. Even the Galaxy s3 only has max 256mb heap
> >> >> allowed (i think).
> >> >>
> >> >> Personally I wouldn't even attempt such a thing, I'd off load the heavy
> >> >> lifting to a server and simply have the client mobile device request
> >> >> whatever it needed.
> >> >>
> >> >>
> >> >> On 1 Feb 2013, at 14:55, Jake Mannix wrote:
> >> >>
> >> >> > Hi Vignesh,
> >> >> >
> >> >> > You've got a lot of steps to go through before you can start talking
> >> >> > about putting it on your mobile device: you need to get your training
> >> >> > data, train your classifier offline using Mahout, write code in your
> >> >> > mobile app which links to and uses the classifier package in Mahout that
> >> >> > will understand how to use the serialized classifier data file, then make
> >> >> > sure your classifier data file is either bundled with your mobile app, or
> >> >> > else downloads it when it needs it.
> >> >> >
> >> >> > So first, you need to train a classifier (check out Mahout In Action for
> >> >> > more detailed instructions on this), it will result in a serialized
> >> >> > classifier model on disk at the end of this process.
> >> >> >
> >> >> >
> >> >> > On Thu, Jan 31, 2013 at 10:23 PM, VIGNESH S <[email protected]> wrote:
> >> >> >
> >> >> >> Hi ,
> >> >> >>
> >> >> >> Thanks for the reply..
> >> >> >>
> >> >> >> How can we make use of the training data done using Hadoop in mobile
> >> >> >> phones..
> >> >> >>
> >> >> >> For Example,i can do some sort of serialization and store it on disk
> >> >> >> and deserialize in mobile and use that data..
> >> >> >>
> >> >> >> is that possible or how can i use the training data without connecting
> >> >> >> to a hadoop cluster in real time..
> >> >> >>
> >> >> >>
> >> >> >> Thanks and Regards
> >> >> >> Vignesh Srinivasan
> >> >> >>
> >> >> >>
> >> >> >> On Thu, Jan 31, 2013 at 7:43 AM, Jake Mannix <[email protected]> wrote:
> >> >> >>> The *training* of many Mahout algorithms are on Hadoop, but the output
> >> >> >>> classifiers (e.g. a binary text classifier [trained with L1 regularization
> >> >> >>> to sparsify] for spam filtering) could certainly fit on a small footprint
> >> >> >>> like a mobile phone.
> >> >> >>>
> >> >> >>>
> >> >> >>> On Wed, Jan 30, 2013 at 7:46 AM, Mahesh Balija <[email protected]> wrote:
> >> >> >>>
> >> >> >>>> AFAIK it is NOT possible. As Mahout runs on top of Hadoop.
> >> >> >>>> Also Hadoop is a distributed computing framework, it will run on
> >> >> >>>> cluster of machines.
> >> >> >>>> So ideally it may NOT be possible to run on a Mobile.
> >> >> >>>>
> >> >> >>>> On Wed, Jan 30, 2013 at 8:46 PM, VIGNESH S <[email protected]> wrote:
> >> >> >>>>
> >> >> >>>>> I am trying to implement some classification in android mobile
> >> >> >>>>> device..
> >> >> >>>>>
> >> >> >>>>> is it possible to use mahout in mobile device..Please kindly help me
> >> >> >>>>>
> >> >> >>>>> --
> >> >> >>>>> Thanks and Regards
> >> >> >>>>> Vignesh Srinivasan
> >> >> >>>>> 9739135640

--

-jake
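
A minimal sketch of the two-step split described in this thread (train offline, ship a small model blob, classify on the device), assuming a sparse linear text classifier. It is plain Java and deliberately does not use Mahout's API: `SparseModelSketch`, `toBytes`, `fromBytes`, `score`, and the example weights are all hypothetical, and the weights stand in for whatever an offline training run would actually produce.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.DataInputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.HashMap;
import java.util.Map;

// Hypothetical illustration only: none of these names come from Mahout.
final class SparseModelSketch {
    final Map<String, Double> weights; // only nonzero term weights are stored
    final double bias;

    SparseModelSketch(Map<String, Double> weights, double bias) {
        this.weights = weights;
        this.bias = bias;
    }

    // End of step 1 (training side): write the weights to a compact binary blob.
    byte[] toBytes() {
        ByteArrayOutputStream buffer = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(buffer)) {
            out.writeDouble(bias);
            out.writeInt(weights.size());
            for (Map.Entry<String, Double> entry : weights.entrySet()) {
                out.writeUTF(entry.getKey());
                out.writeDouble(entry.getValue());
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return buffer.toByteArray();
    }

    // Start of step 2 (device side): rebuild the model from the blob.
    static SparseModelSketch fromBytes(byte[] blob) {
        try (DataInputStream in = new DataInputStream(new ByteArrayInputStream(blob))) {
            double bias = in.readDouble();
            int count = in.readInt();
            Map<String, Double> weights = new HashMap<>();
            for (int i = 0; i < count; i++) {
                String term = in.readUTF();
                double weight = in.readDouble();
                weights.put(term, weight);
            }
            return new SparseModelSketch(weights, bias);
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Logistic score in (0, 1): add the weight of every known token
    // occurrence to the bias, then squash with the sigmoid function.
    double score(String text) {
        double sum = bias;
        for (String token : text.toLowerCase().split("\\W+")) {
            Double weight = weights.get(token);
            if (weight != null) {
                sum += weight;
            }
        }
        return 1.0 / (1.0 + Math.exp(-sum));
    }

    public static void main(String[] args) {
        // Made-up weights standing in for the output of offline training.
        Map<String, Double> trainedWeights = new HashMap<>();
        trainedWeights.put("lottery", 2.0);
        trainedWeights.put("winner", 1.5);
        trainedWeights.put("meeting", -2.0);

        byte[] blob = new SparseModelSketch(trainedWeights, -0.5).toBytes();
        SparseModelSketch onDevice = SparseModelSketch.fromBytes(blob);

        System.out.println("blob size in bytes: " + blob.length);
        System.out.println("spam-like score: " + onDevice.score("You are a lottery winner"));
        System.out.println("ham-like score: " + onDevice.score("Agenda for the meeting"));
    }
}
```

Only the nonzero weights are written out, which is why a sparsified model (for example one trained with L1 regularization, as mentioned above) can stay small enough to bundle with an app. Nothing on the scoring side needs Hadoop or the training data.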
