Em,

Typically in machine learning, a feature vector is just a vector of numbers that describes the data. For example, if you are trying to classify images, the features might be a vector of pixel intensities. Or you could process the image to extract higher-level features: you might compute some basic statistics of the pixel intensities for each image (e.g., the mean, max, and min) and then use those summary statistics as the features for each image.

So in your case, if you use key and value as the features, then you have a 2-d feature vector. Can you describe your data a little more?

J

On Sun, 2011-05-22 at 05:56 -0700, Em wrote:
> Hi list,
>
> I just read Mahout in Action and I tried to understand the chapter about
> classifying data.
> While I am reimplementing one of the examples from the book, I get really
> confused and a little bit disappointed about the assumptions the author
> makes about the reader.
>
> There are some lines of code where you can see a variable is in use but you
> never saw where and how it was defined.
>
> So far, my question is:
>
> When using an OnlineLogisticRegression-Algorithm, what is meant by "feature"?
>
> Let's say I got a bunch of data in a csv-format.
> There are the following columns I want to consider for classification:
> "Key", "Value" - does it mean I got two features?
>
> Thanks,
> Em
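
As a rough illustration of the two cases mentioned in the reply (summary statistics of an image, and the "Key"/"Value" columns used directly), here is a minimal Python sketch. It is not Mahout code: the function names, the sample image, and the CSV contents are all made up for the example, and it assumes both columns hold numeric values, which may not be true for your data.

```python
import csv
import io


def summary_features(pixels):
    """Turn a flat list of pixel intensities into the vector [mean, max, min]."""
    return [sum(pixels) / len(pixels), max(pixels), min(pixels)]


def key_value_features(row):
    """Map one CSV row to a 2-d feature vector.

    Assumption: both the "Key" and "Value" columns are numeric.
    """
    return [float(row["Key"]), float(row["Value"])]


# Hypothetical 2x2 grayscale image, flattened row by row.
image = [0, 64, 128, 255]
image_features = summary_features(image)

# Hypothetical CSV data with numeric "Key" and "Value" columns.
csv_text = "Key,Value\n1,2.5\n3,4.0\n"
rows = list(csv.DictReader(io.StringIO(csv_text)))
csv_features = [key_value_features(row) for row in rows]

print(image_features)  # one 3-d feature vector for the image
print(csv_features)    # one 2-d feature vector per CSV row
```

If a column is not numeric (for example, a text key), it would need to be encoded as numbers first, which is why the details of the data matter.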
