Hi Zoraida,

Thanks for the follow-up! I went with a short, custom ColumnSelector class, but
the itemgetter approach is even nicer.
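
For the archive, here is a minimal sketch of how such a pipeline could look. It is an illustration under assumptions, not necessarily the class mentioned above: ColumnSelector and DenseTransformer are hypothetical helpers (they are not part of scikit-learn), the input is assumed to be a 2-D object array with the text in column 0 and the numeric feature in column 1, and the labels y are made up for the example.

```python
import numpy as np
from scipy import sparse
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.ensemble import RandomForestClassifier
from sklearn.feature_extraction.text import CountVectorizer, TfidfTransformer
from sklearn.pipeline import FeatureUnion, Pipeline


class ColumnSelector(TransformerMixin, BaseEstimator):
    """Hypothetical helper: pick one column out of a 2-D object array."""

    def __init__(self, column, numeric=False):
        self.column = column
        self.numeric = numeric

    def fit(self, X, y=None):
        return self  # nothing to learn

    def transform(self, X):
        col = np.asarray(X, dtype=object)[:, self.column]
        if self.numeric:
            # Return an (n_samples, 1) float matrix so it can be hstacked.
            return col.astype(float).reshape(-1, 1)
        return col  # 1-D array of strings for the vectorizer


class DenseTransformer(TransformerMixin, BaseEstimator):
    """Hypothetical helper: convert a sparse matrix to a dense array."""

    def fit(self, X, y=None):
        return self

    def transform(self, X):
        return X.toarray() if sparse.issparse(X) else np.asarray(X)


# Toy data from the original question, packed into one object array.
data1 = ['This is the first sentence',
         'This is the second sentence',
         'This is the third sentence']
data2 = np.array([[0], [1], [0]])
X = np.empty((len(data1), 2), dtype=object)
X[:, 0] = data1
X[:, 1] = data2.ravel()
y = [0, 1, 0]  # made-up labels, for illustration only

text_features = Pipeline([
    ('select', ColumnSelector(0)),
    ('vect', CountVectorizer()),
    ('tfidf', TfidfTransformer()),
])
extra_features = ColumnSelector(1, numeric=True)

featurizer = Pipeline([
    # FeatureUnion stacks the outputs of both branches side by side.
    ('union', FeatureUnion([('text', text_features),
                            ('extra', extra_features)])),
    ('to_dense', DenseTransformer()),
])
dense_features = featurizer.fit_transform(X)

model = Pipeline([
    ('features', featurizer),
    ('clf', RandomForestClassifier(n_estimators=10, random_state=0)),
])
model.fit(X, y)
predictions = model.predict(X)
```

FeatureUnion does the job that hstack would do by hand, and the dense conversion happens once, after the union. If the input were a dict of columns instead of an array, operator.itemgetter wrapped in scikit-learn's FunctionTransformer could play the same role as ColumnSelector.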

Best,
Sebastian

On Aug 21, 2014, at 2:57 PM, ZORAIDA HIDALGO SANCHEZ 
<zoraida.hidalgosanc...@telefonica.com> wrote:

> Sebastian,
> 
> a few days ago, I asked a very similar question and I got this link as a
> response:
> 
> https://github.com/scikit-learn/scikit-learn/issues/2034
> 
> 
> I think that you could try something similar.
> 
> 
> Best,
> 
> Zoraida.-
> 
> On 21/08/14 18:48, "Sebastian Okser" <seo...@utu.fi> wrote:
> 
>> I am trying to use a Pipeline that combines CountVectorizer,
>> TfidfTransformer and a random forest. However, the output of the second
>> step is a sparse matrix, and the random forest requires a dense one. How
>> can I add a step that converts the matrix from sparse to dense, using
>> something along the lines of data.toarray()? Additionally, I would like
>> to add extra features to the dataset after the text has been processed.
>> How can I create a step for this (normally I would use something like
>> hstack)? My code is as follows:
>> 
>> pipeline = Pipeline([
>>   ('vect', CountVectorizer()),
>>   ('tfidf', TfidfTransformer()),
>>   ('clf', OneVsRestClassifier(SVC(probability=True))),
>> ])
>> 
>> I would like to adjust this somehow to the following:
>> 
>> pipeline = Pipeline([
>>   ('vect', CountVectorizer()),
>>   ('tfidf', TfidfTransformer()),
>>   ('change_to_dense', SOME HOW CHANGE TO DENSE),
>>   ('add_more_data', SOME HOW ADD FEATURES),
>>   ('clf', OneVsRestClassifier(SVC(probability=True))),
>> ])
>> 
>> My first dataset, let's call it data1, is just a list of sentences. Below
>> is an example:
>> 
>> data1 = ['This is the first sentence',
>>          'This is the second sentence',
>>          'This is the third sentence']
>> 
>> The second dataset is numerical data of the following form:
>> 
>> data2 = array([[0],
>>                [1],
>>                [0]])
>> 
>> 
>> Thanks!
>> ------------------------------------------------------------------------------
>> Slashdot TV.
>> Video for Nerds.  Stuff that matters.
>> http://tv.slashdot.org/
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
> 
> 
> ________________________________
> 
> The information contained in this transmission is privileged and confidential 
> information intended only for the use of the individual or entity named 
> above. If the reader of this message is not the intended recipient, you are 
> hereby notified that any dissemination, distribution or copying of this 
> communication is strictly prohibited. If you have received this transmission 
> in error, do not read it. Please immediately reply to the sender that you 
> have received this communication in error and then delete it.