I am ready to move the FeatureVectorEncoders under vectorizer.encoder.*. I
had wanted to move them directly under vectorizer, but they form a
homogeneous group and should be kept separate from jobs and other classes.

I need to create a dictionary-based FeatureEncoder. For that, I am thinking
about the following.

I will rename FeatureVectorEncoder to ProbedFeatureVectorEncoder, an
abstract class. This abstract class will implement a FeatureEncoder
interface with two methods: int encode(String) and int encode(byte[]).

I will also implement this interface in two FeatureEncoders: TFTextEncoder
and TFIDFTextEncoder.
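
To make the proposal concrete, here is a rough sketch of how the pieces
might fit together. Only the FeatureEncoder interface, its two encode
signatures, and the ProbedFeatureVectorEncoder name come from the plan
above. The DictionaryFeatureEncoder class, the numFeatures field, the hash()
hook, and the UTF-8 conversion are assumptions made for illustration, not
existing Mahout code.

```java
import java.nio.charset.StandardCharsets;
import java.util.HashMap;
import java.util.Map;

// Proposed interface: maps a feature to an integer index.
interface FeatureEncoder {
  int encode(String feature);
  int encode(byte[] feature);
}

// Sketch of the renamed abstract class. The hash() hook and numFeatures
// are assumptions; subclasses would supply the actual hashing scheme.
abstract class ProbedFeatureVectorEncoder implements FeatureEncoder {
  private final int numFeatures;

  protected ProbedFeatureVectorEncoder(int numFeatures) {
    this.numFeatures = numFeatures;
  }

  protected abstract int hash(byte[] feature);

  @Override
  public int encode(String feature) {
    return encode(feature.getBytes(StandardCharsets.UTF_8));
  }

  @Override
  public int encode(byte[] feature) {
    // floorMod keeps the index in [0, numFeatures) for negative hashes.
    return Math.floorMod(hash(feature), numFeatures);
  }
}

// Hypothetical dictionary-based encoder: each unseen feature is assigned
// the next free index, and repeated features reuse their index.
class DictionaryFeatureEncoder implements FeatureEncoder {
  private final Map<String, Integer> dictionary = new HashMap<>();

  @Override
  public int encode(String feature) {
    Integer index = dictionary.get(feature);
    if (index == null) {
      index = dictionary.size();
      dictionary.put(feature, index);
    }
    return index;
  }

  @Override
  public int encode(byte[] feature) {
    return encode(new String(feature, StandardCharsets.UTF_8));
  }

  int size() {
    return dictionary.size();
  }
}
```

With this shape, callers depend only on FeatureEncoder, so a hashed encoder
and a dictionary encoder could be swapped without touching the code that
builds the vectors.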

Ted and others, how does this sound?

On Mon, Oct 4, 2010 at 6:30 AM, Grant Ingersoll <[email protected]> wrote:

>
> On Oct 2, 2010, at 2:23 PM, Robin Anil wrote:
>
> > How do you feel like moving the DictionaryVectorizer and Colloc generator to
> > the Core under vectorizers package instead of keeping them under utils.
> > FeatureEncoders will also be moved under vectorizers. I want to add a
> > Wrapper which takes a Vectorizer and converts input data to vectors. Its the
> > missing piece of the Classifier puzzle
>
> Why does that require Vectorizer be moved to Core?  Vectorizer seems like a
> util to me.
>
> >
> > o.a.m.vectorizer.dictionary
> > o.a.m.vectorizer.hashed or something funkier?
> >
> > What do you think about this?
> >
> > Robin
>
> --------------------------
> Grant Ingersoll
> http://lucenerevolution.org Apache Lucene/Solr Conference, Boston Oct 7-8
>
>
