It's EncodedVectorsFromSequenceFiles.java, I believe.
------
Robin Anil

On Tue, Jul 31, 2012 at 6:05 PM, Eric Friedman <[email protected]> wrote:

> Can you point me to the class I should look at to see how this is done?
>
> On Tue, Jul 31, 2012 at 10:49 AM, Robin Anil <[email protected]> wrote:
> > You can pass in any vector (not just a TF-IDF vector). For example, the
> > asf-email example script uses vectors generated using the randomized
> > encoding.
> > ------
> > Robin Anil
> >
> >
> > On Tue, Jul 31, 2012 at 12:26 PM, Sean Owen <[email protected]> wrote:
> >
> >> I don't know this code very well, but I believe there is simply a step
> >> in front that vectorizes text with TF-IDF. The results are simple
> >> vectors. You could just inject your vectors (i.e. real-valued
> >> attributes) at that stage and skip the TF-IDF. It may need a little
> >> hacking.
> >>
> >> On Tue, Jul 31, 2012 at 6:21 PM, Eric Friedman <[email protected]>
> >> wrote:
> >> > All of the examples that I've found for training NB classifiers seem
> >> > to have textual data as input.  Is there a way to build a classifier
> >> > with more general attributes?
> >> >
> >> > I found this jira ticket
> >> > (https://issues.apache.org/jira/browse/MAHOUT-286), but it's been
> >> > closed:duplicate under
> >> > https://issues.apache.org/jira/browse/MAHOUT-155, which doesn't seem
> >> > to address the underlying question.
> >> >
> >> > I know that I can do this with Weka, but not at scale -- is Mahout
> >> > only able to build textual classifiers?
> >> >
> >> > Thanks,
> >> > Eric
> >>
>
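
The idea discussed in this thread (skip TF-IDF and feed the classifier vectors built from general attributes via a randomized, i.e. hashed, encoding) can be sketched as follows. This is a minimal plain-Java illustration of feature hashing, not Mahout's actual encoder classes or the code in EncodedVectorsFromSequenceFiles.java. The class name HashedAttributeEncoder, its method names, the choice of FNV-1a as the hash, and the sample attributes are all assumptions made for the example.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;

/**
 * Illustrative sketch only (hypothetical class, not part of Mahout):
 * hashes named attributes into a fixed-size array of doubles.
 */
final class HashedAttributeEncoder {
    private final int cardinality;

    HashedAttributeEncoder(int cardinality) {
        if (cardinality <= 0) {
            throw new IllegalArgumentException("cardinality must be positive");
        }
        this.cardinality = cardinality;
    }

    /** Maps a key to a slot in [0, cardinality) using a 32-bit FNV-1a hash. */
    private int slot(String key) {
        int h = 0x811c9dc5;
        for (byte b : key.getBytes(StandardCharsets.UTF_8)) {
            h ^= (b & 0xff);
            h *= 0x01000193;
        }
        return Math.floorMod(h, cardinality);
    }

    /**
     * Encodes one record. Each numeric attribute adds its value to the slot
     * chosen by its name; each categorical attribute adds 1.0 to the slot
     * chosen by "name=value". Colliding attributes simply add up.
     */
    double[] encode(Map<String, Double> numeric, Map<String, String> categorical) {
        double[] vector = new double[cardinality];
        for (Map.Entry<String, Double> e : numeric.entrySet()) {
            vector[slot(e.getKey())] += e.getValue();
        }
        for (Map.Entry<String, String> e : categorical.entrySet()) {
            vector[slot(e.getKey() + "=" + e.getValue())] += 1.0;
        }
        return vector;
    }
}
```

For instance, new HashedAttributeEncoder(16).encode(numeric, categorical) returns a 16-element array for any record, regardless of how many attributes it has. In a real Mahout job the resulting values would presumably be wrapped in Mahout vectors and written out for the trainer instead of being kept as a raw array.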
