Hi Grant,

The input for Bayes is a tab-separated flat file with one document per line. Each line starts with the label as the first word, followed by a tab, followed by the flattened document.

I will be travelling for the next 3 days, as I am relocating to my job location, so I hope to be able to give you the documentation for this by Monday morning.
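To make the input format above concrete, here is a rough sketch in Python. It is not the actual Mahout code: the function names are made up for illustration, and treating "flattened" as "all whitespace collapsed to single spaces" is an assumption on my part.

```python
from typing import Iterable, Iterator, Tuple


def format_example(label: str, text: str) -> str:
    """Build one input line: the label, a tab, then the document on one line.

    Hypothetical helper, not part of Mahout.
    """
    if not label or any(ch.isspace() for ch in label):
        raise ValueError("label must be a single non-empty word")
    # Assumption: "flattening" means collapsing tabs, newlines and
    # repeated spaces so the document fits on a single line.
    flattened = " ".join(text.split())
    return f"{label}\t{flattened}"


def parse_example(line: str) -> Tuple[str, str]:
    """Split one input line back into (label, document)."""
    label, sep, document = line.rstrip("\r\n").partition("\t")
    if not sep:
        raise ValueError("expected a tab between label and document")
    return label, document


def read_examples(lines: Iterable[str]) -> Iterator[Tuple[str, str]]:
    """Yield (label, document) pairs, skipping blank lines."""
    for line in lines:
        if line.strip():
            yield parse_example(line)
```

For example, format_example("sports", "The match\nended in a draw.") gives a single line with "sports", a tab, and then "The match ended in a draw.", and parse_example turns that line back into the label and the document.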
Robin

On Thu, Jul 16, 2009 at 1:02 AM, Grant Ingersoll <[email protected]> wrote:

> Hi Robin,
>
> I have been looking a bit at the classification stuff a bit more and am
> wondering if we should be switching to use Vectors now, since the name could
> be the label and the value can contain weights, similar to what we do for
> clustering.
>
> Also, I was wondering if you could document the format used for the input
> files now and the steps taken by the algorithms. I'm trying to better
> understand the Wikipedia examples and also the HBase.
>
> -Grant
