> My preferred way of implementing that would be a generic,
> stateless transformer class that just runs a function on X in
> transform and returns the result.

I think this is useful anyway, and it is an effective but not ideal
solution for this use case. Here it adds a lot of overhead for what is
really a straightforward application.
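
To make the quoted idea concrete, here is a minimal sketch of such a
transformer. The class name StatelessFunctionTransformer is hypothetical,
and the sketch uses plain Python lists instead of NumPy arrays to stay
dependency-free. A real version would presumably subclass scikit-learn's
BaseEstimator and TransformerMixin so that it works with cloning and
parameter search.

```python
class StatelessFunctionTransformer:
    """Hypothetical sketch: run a user-supplied function in transform()."""

    def __init__(self, func):
        self.func = func

    def fit(self, X, y=None):
        # Nothing to learn: the transformer keeps no fitted state.
        return self

    def transform(self, X):
        # No input validation, so X may be rows, filenames, dicts, ...
        return self.func(X)

    def fit_transform(self, X, y=None):
        return self.fit(X, y).transform(X)


def even_columns(X):
    # Keep columns 0, 2, 4, ... of a table given as a list of rows.
    return [row[::2] for row in X]


X = [[1, 2, 3, 4], [5, 6, 7, 8]]
result = StatelessFunctionTransformer(even_columns).fit_transform(X)
print(result)  # [[1, 3], [5, 7]]
```

Even in this small form, selecting a couple of columns takes a function
definition plus a wrapper object, which is the overhead I mean.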

On 28 February 2014 12:37, Michael Kneier <michael.kne...@gmail.com> wrote:
> Thanks for the great replies. As Lars rightly points out, I could define a
> custom transform to accomplish the combining.
>
> I do think that this could be implemented more intuitively (or at least
> built into FeatureUnion), and I'd like to pitch in on
> https://github.com/scikit-learn/scikit-learn/issues/2034. I will take a
> closer look this weekend.
>
> Thanks,
> Mike
>
>
> On Thu, Feb 27, 2014 at 3:11 PM, Lars Buitinck <larsm...@gmail.com> wrote:
>>
>> 2014-02-27 23:37 GMT+01:00 Joel Nothman <joel.noth...@gmail.com>:
>> > I think it would be nice if the FeatureUnion makes it easy to extract
>> > only certain parts of the input for each transformer.
>> > https://github.com/scikit-learn/scikit-learn/issues/2034 intends to
>> > cover this issue, but we haven't resolved a clean API.
>> >
>> > Suggestions are welcome!
>>
>> I hope you don't mind me replying here: I think this can be resolved
>> by custom transformers that pass through a user-specified set of
>> columns. My preferred way of implementing that would be a generic,
>> stateless transformer class that just runs a function on X in
>> transform and returns the result. If this transformer doesn't do input
>> validation, you could make a union
>>
>>     make_pipeline(FunctionTransformer(extract_description_terms),
>>                   TfidfTransformer())
>>     ∪
>>     make_pipeline(FunctionTransformer(extract_portrait_pixels), PCA())
>>
>> and feed this filenames, or dicts, or whatever. The original problem
>> of letting through only some columns is then
>>
>>     import numpy as np
>>
>>     def even_columns(X, *args):
>>         # Keep every second column (0, 2, 4, ...).
>>         X = np.asarray(X)
>>         return X[:, ::2]
>>
>>     FunctionTransformer(even_columns)
>>
>> And of course, these things are more generally useful for inserting a
>> simple function in the middle of a pipeline.
>>
>>
>> ------------------------------------------------------------------------------
>> Flow-based real-time traffic analytics software. Cisco certified tool.
>> Monitor traffic, SLAs, QoS, Medianet, WAAS etc. with NetFlow Analyzer
>> Customize your own dashboards, set traffic alerts and generate reports.
>> Network behavioral analysis & security monitoring. All-in-one tool.
>>
>> http://pubads.g.doubleclick.net/gampad/clk?id=126839071&iu=/4140/ostg.clktrk
>> _______________________________________________
>> Scikit-learn-general mailing list
>> Scikit-learn-general@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>
