Thanks, Joel, for recommending FeatureUnion. I did run across it, but for just 2 features I thought it might be overkill. I am aware of Pipeline, which the scikit-learn example explains very well, and I was going to use it once I finalize my script. I did not want to abstract away too much early on, since I am in the beginning stages of learning machine learning and scikit-learn.

- Daniel

On Wed, Aug 2, 2017 at 10:38 PM, Joel Nothman <joel.noth...@gmail.com> wrote:

> Use a Pipeline to help avoid this kind of issue (and others). You might
> also want to do something like
> http://scikit-learn.org/stable/auto_examples/hetero_feature_union.html
>
> On 3 August 2017 at 12:01, pybokeh <pybo...@gmail.com> wrote:
>
>> Hello,
>> I am studying this example from scikit-learn's site:
>> http://scikit-learn.org/stable/tutorial/text_analytics/working_with_text_data.html
>>
>> The problem that I need to solve is very similar to this example, except
>> I have one additional feature column (part #) that is categorical of type
>> string. My label or target values consist of just 2 values: 0 or 1.
>>
>> With that additional feature column, I am transforming it with a
>> LabelEncoder and then I am encoding it with the OneHotEncoder.
>>
>> Then I am concatenating that one-hot encoded column (part #) to the
>> text/document feature column (complaint), which I had applied the
>> CountVectorizer and TfidfTransformer transformations.
>>
>> Then I chose the MultinomialNB model to fit my concatenated training data
>> with.
>>
>> The problem I run into is when I invoke the prediction, I get a dimension
>> mis-match error.
>>
>> Here's my jupyter notebook gist:
>> http://nbviewer.jupyter.org/gist/anonymous/59ba930a783571c85ef86ba41424b311
>>
>> I would gladly appreciate it if someone can guide me where I went wrong.
>> Thanks!
>>
>> - Daniel
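
For readers following the thread: a common cause of a dimension mismatch at prediction time is refitting the vectorizer or encoder on the test data, which yields a different number of columns than the model was trained on. Whether that is what happens in the linked notebook has not been verified here. The sketch below shows the manual (no Pipeline) version of the steps described above, with every transformer fitted on the training data only and then reused via transform() on the test data. The toy complaints, part numbers, and variable names are made up for illustration, and passing strings directly to OneHotEncoder assumes scikit-learn 0.20 or later (older versions need the LabelEncoder step first).

```python
# Sketch only: toy data and variable names are hypothetical.
from scipy.sparse import hstack
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.naive_bayes import MultinomialNB
from sklearn.preprocessing import OneHotEncoder

# Training data: complaint text, part number (one column, 2-D), and 0/1 label.
train_text = ["engine stalls at idle", "paint peeling on hood",
              "engine makes loud noise", "paint scratched on door"]
train_part = [["P100"], ["P200"], ["P100"], ["P200"]]
train_y = [1, 0, 1, 0]

# Test data; part "P300" never appears in the training set.
test_text = ["engine stalls when cold", "door paint faded"]
test_part = [["P100"], ["P300"]]

count_vect = CountVectorizer()
tfidf = TfidfTransformer()
# handle_unknown="ignore" encodes unseen part numbers as all zeros
# instead of raising an error at transform time.
onehot = OneHotEncoder(handle_unknown="ignore")

# Fit every transformer on the training data only.
X_train = hstack([
    tfidf.fit_transform(count_vect.fit_transform(train_text)),
    onehot.fit_transform(train_part),
]).tocsr()

clf = MultinomialNB().fit(X_train, train_y)

# Reuse the fitted transformers: transform(), never fit_transform(), here.
X_test = hstack([
    tfidf.transform(count_vect.transform(test_text)),
    onehot.transform(test_part),
]).tocsr()

# Same number of columns as in training, so predict() accepts the matrix.
predictions = clf.predict(X_test)
```

Wrapping the same steps in a Pipeline, as Joel suggests, keeps this fit/transform bookkeeping in one object, but the manual form makes it easier to see where the column counts come from.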
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn