Hi Zoraida,

thanks for the follow-up! I went with a short custom ColumnSelector class, but the itemgetter approach is even nicer.
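For anyone reading this in the archive, here is a minimal sketch of the idea, not necessarily the code from the linked issue. ColumnSelector and DenseTransformer as written below are hypothetical helpers, and the dict-based sample layout and the "text"/"extra" keys are assumptions made only for illustration. The selector uses operator.itemgetter to pull one field out of each sample, and the dense step calls toarray(), which scipy sparse matrices provide.

```python
from operator import itemgetter


class ColumnSelector:
    """Pick one field out of every sample (hypothetical helper, not a scikit-learn class)."""

    def __init__(self, key):
        self.key = key

    def fit(self, X, y=None):
        # Nothing to learn; returning self follows the fit/transform convention.
        return self

    def transform(self, X):
        getter = itemgetter(self.key)
        return [getter(row) for row in X]


class DenseTransformer:
    """Convert sparse input to dense via toarray() (hypothetical helper)."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        # scipy.sparse matrices expose toarray(); dense input passes through unchanged.
        return X.toarray() if hasattr(X, "toarray") else X


# Assumed layout: each sample carries the sentence (data1) and the numeric value (data2).
samples = [
    {"text": "This is the first sentence", "extra": 0},
    {"text": "This is the second sentence", "extra": 1},
    {"text": "This is the third sentence", "extra": 0},
]

texts = ColumnSelector("text").fit(samples).transform(samples)
extras = ColumnSelector("extra").fit(samples).transform(samples)
print(texts)
print(extras)
```

To drop these into a real Pipeline they would also need to inherit from sklearn.base.BaseEstimator and TransformerMixin, so that the pipeline can clone them. A FeatureUnion with one branch per selector can then take the place of a manual hstack.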
Best,
Sebastian

On Aug 21, 2014, at 2:57 PM, ZORAIDA HIDALGO SANCHEZ <zoraida.hidalgosanc...@telefonica.com> wrote:

> Sebastian,
>
> a few days ago, I asked a very similar question and I got this link as a
> response:
>
> https://github.com/scikit-learn/scikit-learn/issues/2034
>
> I think that you could try something similar.
>
> Best,
>
> Zoraida.-
>
> On 21/08/14 18:48, "Sebastian Okser" <seo...@utu.fi> wrote:
>
>> I am trying to use the pipeline combined with a countvectorizer,
>> tfidftransformer and randomforest. However, the output of the second step
>> is a sparse array and randomforest requires a dense one. How can I add a
>> step to allow for a conversion of the matrix from sparse to dense, using
>> something along the lines of data.toarray()? Additionally, I would like
>> to add some additional features to the dataset after the text has been
>> processed. How can I create a step for this (normally I could use
>> something like hstack)? My code is as follows:
>>
>> pipeline = Pipeline([
>>     ('vect', CountVectorizer()),
>>     ('tfidf', TfidfTransformer()),
>>     ('clf', OneVsRestClassifier(SVC(probability=True))),
>> ])
>>
>> I would like to adjust this somehow to the following:
>>
>> pipeline = Pipeline([
>>     ('vect', CountVectorizer()),
>>     ('tfidf', TfidfTransformer()),
>>     ('change_to_dense', SOME HOW CHANGE TO DENSE),
>>     ('add_more_data', SOME HOW ADD FEATURES),
>>     ('clf', OneVsRestClassifier(SVC(probability=True))),
>> ])
>>
>> My first dataset, let's call it data1, is just an array of sentences. Below
>> is an example:
>>
>> data1 = ['This is the first sentence',
>>          'This is the second sentence',
>>          'This is the third sentence']
>>
>> The second dataset is numerical data of the following form:
>>
>> data2 = array([[0],
>>                [1],
>>                [0]])
>>
>> Thanks!
_______________________________________________
Scikit-learn-general mailing list
Scikit-learn-general@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/scikit-learn-general