Hi,

I have Twitter data in a single text file, in the following format:
@VancityBeerGuy - RT @BCBerrie: well @VancityBeerGuy you know what they say 
about guys with #smallenfreuden right? Hahaha Created At:Mon Jun 03 07:18:46 
IST 2013
@IanSylves - RT @PTorgo91: @otterN9NE you're the best thing to happen to the 
#sabres since Drury #lordstanley#nextyear #smallenfreuden Created At:Mon Jun 03 
07:18:37 IST 2013
@LiLItalyPasta - RT @LamyaAsiff: #smallenfreuden is #stupidfreuden. Created 
At:Mon Jun 03 07:17:36 IST 2013
@MMBris - RT @jaimestein: Whenever you find yourself on the side of the 
majority, it is time to pause and #smallenfreuden. -Mark Twain Created At:Mon 
Jun 03 07:16:43 IST 2013
@SeanBickerton - RT @kbieksa3: Big save by Bernier to keep it somewhat close. 
Leave it to a french guy to get the boys going... @aburr14 #Smallenfreuden 
Created At:Mon Jun 03 07:16:41 IST 2013
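
To show how I read this format, here is a minimal sketch that splits one
record into user, tweet text, and timestamp. It assumes each record sits on a
single line in the real file (the line breaks above may only come from e-mail
wrapping) and always follows the pattern "@user - text Created At:date". The
class and method names (TweetRecordParser, Tweet, parse) are made up for
illustration and are not part of Mahout.

```java
import java.util.Optional;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class TweetRecordParser {

    /** Hypothetical holder for the three parts of one record. */
    static final class Tweet {
        final String user;
        final String text;
        final String createdAt;

        Tweet(String user, String text, String createdAt) {
            this.user = user;
            this.text = text;
            this.createdAt = createdAt;
        }
    }

    // Assumed layout: "@user - <tweet text> Created At:<timestamp>".
    // DOTALL lets the pattern also match a record whose wrapped lines
    // were joined with newlines instead of spaces.
    private static final Pattern RECORD = Pattern.compile(
            "^@(\\w+)\\s+-\\s+(.*)\\s+Created\\s+At:(.*)$", Pattern.DOTALL);

    /** Returns the parsed record, or empty if the line does not match. */
    static Optional<Tweet> parse(String line) {
        Matcher m = RECORD.matcher(line.trim());
        if (!m.matches()) {
            return Optional.empty();
        }
        return Optional.of(
                new Tweet(m.group(1), m.group(2).trim(), m.group(3).trim()));
    }

    public static void main(String[] args) {
        String line = "@LiLItalyPasta - RT @LamyaAsiff: #smallenfreuden is "
                + "#stupidfreuden. Created At:Mon Jun 03 07:17:36 IST 2013";
        parse(line).ifPresent(t ->
                System.out.println(t.user + " | " + t.text + " | " + t.createdAt));
    }
}
```

The timestamp is kept as a plain string here, since I am not sure yet whether
it should be a feature at all.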

I need to generate vectors for KMeans clustering from this text file using Java.

I need help selecting the features.

Lines from Mahout in Action:

The process of selecting the features of an object and mapping them to numbers
is known as feature selection. The process of encoding features as a vector is
vectorization.
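
To make the question concrete, here is a rough sketch of how I currently
understand those two steps for tweets. It is plain Java with arrays, not
Mahout's own vector classes, and the names (TweetVectorizer, selectFeatures,
vectorize) are hypothetical. The choice of features is only my assumption:
lower-cased words and hashtags are kept, while "RT" and @mentions are dropped.
That choice is exactly the part I am unsure about.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Locale;
import java.util.Map;

final class TweetVectorizer {

    /** Feature selection (assumed rule): keep words and hashtags only. */
    static List<String> selectFeatures(String text) {
        List<String> features = new ArrayList<>();
        for (String raw : text.toLowerCase(Locale.ROOT).split("\\s+")) {
            if (raw.startsWith("@")) {
                continue; // drop @mentions
            }
            // Strip punctuation but keep '#' so hashtags stay distinct.
            String token = raw.replaceAll("[^a-z0-9#]", "");
            if (token.isEmpty() || token.equals("rt") || token.equals("#")) {
                continue; // drop empty tokens and the retweet marker
            }
            features.add(token);
        }
        return features;
    }

    /**
     * Vectorization: each distinct feature gets one dimension, and each
     * tweet becomes a vector of raw term counts. New features are added
     * to the dictionary in order of first appearance.
     */
    static double[][] vectorize(List<String> texts, Map<String, Integer> dictionary) {
        List<List<String>> selected = new ArrayList<>();
        for (String text : texts) {
            List<String> features = selectFeatures(text);
            for (String feature : features) {
                dictionary.putIfAbsent(feature, dictionary.size());
            }
            selected.add(features);
        }
        double[][] vectors = new double[texts.size()][dictionary.size()];
        for (int i = 0; i < selected.size(); i++) {
            for (String feature : selected.get(i)) {
                vectors[i][dictionary.get(feature)] += 1.0;
            }
        }
        return vectors;
    }

    public static void main(String[] args) {
        List<String> texts = Arrays.asList(
                "RT @LamyaAsiff: #smallenfreuden is #stupidfreuden.",
                "RT @kbieksa3: Big save by Bernier to keep it somewhat close.");
        Map<String, Integer> dictionary = new LinkedHashMap<>();
        double[][] vectors = vectorize(texts, dictionary);
        System.out.println(dictionary);
        for (double[] v : vectors) {
            System.out.println(Arrays.toString(v));
        }
    }
}
```

Raw counts are the simplest weighting I could think of. I do not know whether
something like TF-IDF, or using only the hashtags, would give better clusters.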


Thanks
-N
