Dear all,

After some experimenting with the concept, we have developed a module that
implements the functionality of Pipeline in more general contexts, as
first introduced in an earlier thread
( https://mail.python.org/pipermail/scikit-learn/2018-January/002158.html ).

To extend Pipeline to workflows that are not linearly sequential, we use a
graph-like structure while keeping, as much as possible, the familiar
syntax we all love and honor:

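import pandas as pd
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import MinMaxScaler
# PipeGraph comes from the module announced in this message

# Toy data: a single feature column and a target that is twice the feature
X = pd.DataFrame(dict(X=[0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10]))
y = 2 * X
sc = MinMaxScaler()
lm = LinearRegression()
steps = [('scaler', sc),
         ('linear_model', lm)]
# For each step, map its input names to an external input or to the
# output of another step, written as a (step, output) tuple
connections = {'scaler': dict(X='X'),
               'linear_model': dict(X=('scaler', 'predict'),
                                    y='y')}
pgraph = PipeGraph(steps=steps,
                   connections=connections,
                   use_for_fit='all',
                   use_for_predict='all')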

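As you can see, the biggest difference for the end user is the dictionary
describing the connections: for each step, it maps an input name either to
an external input (such as 'X' or 'y') or to a (step, output) tuple naming
the output of another step.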
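
To illustrate how a connections dictionary of this shape describes a graph,
here is a minimal sketch that derives an execution order from it. This is
not PipeGraph's actual implementation: the helper name execution_order is
hypothetical, and the sketch assumes that a tuple value refers to the output
of another step while a plain string refers to an external input.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

connections = {
    'scaler': dict(X='X'),
    'linear_model': dict(X=('scaler', 'predict'), y='y'),
}


def execution_order(connections):
    """Order step names so that each step comes after the steps it reads from.

    Hypothetical helper for illustration only.
    """
    # A tuple value (step, output) is an edge from another step;
    # a plain string names an external input and adds no edge.
    dependencies = {
        step: {src[0] for src in inputs.values() if isinstance(src, tuple)}
        for step, inputs in connections.items()
    }
    return list(TopologicalSorter(dependencies).static_order())


order = execution_order(connections)
print(order)  # ['scaler', 'linear_model']
```

With a linear chain like this one the order matches a plain Pipeline; the
same dictionary format would also cover steps with several upstream inputs.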

Another major contribution for developers wanting to extend scikit-learn is
a collection of adapters that give scikit-learn models a common API,
irrespective of whether they originally implemented predict, transform, or
fit_predict as an atomic operation without predict. The fit and predict
methods of these adapters accept as many positional or keyword parameters
as needed, through *pargs and **kwargs.
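
The adapter idea can be sketched roughly as follows. This is an assumption
about the general pattern, not the module's actual code: the class name
PredictAdapter and the two toy models are hypothetical, and the toy models
stand in for real scikit-learn estimators so that the example runs on its
own.

```python
class PredictAdapter:
    """Hypothetical wrapper exposing fit/predict for any wrapped model."""

    def __init__(self, model):
        self.model = model

    def fit(self, *pargs, **kwargs):
        # Skip fitting when the wrapped model only offers fit_predict
        if hasattr(self.model, 'fit'):
            self.model.fit(*pargs, **kwargs)
        return self

    def predict(self, *pargs, **kwargs):
        # Dispatch to whichever output method the wrapped model provides
        for name in ('predict', 'transform', 'fit_predict'):
            method = getattr(self.model, name, None)
            if method is not None:
                return method(*pargs, **kwargs)
        raise AttributeError('wrapped model has no usable output method')


class ToyScaler:
    """Stand-in for a transformer: has fit and transform, no predict."""

    def fit(self, X, y=None):
        self.max_ = max(X)
        return self

    def transform(self, X):
        return [x / self.max_ for x in X]


class ToyClusterer:
    """Stand-in for a model that only offers fit_predict."""

    def fit_predict(self, X, y=None):
        return [0 if x < 5 else 1 for x in X]


scaled = PredictAdapter(ToyScaler()).fit([0, 5, 10]).predict([0, 5, 10])
labels = PredictAdapter(ToyClusterer()).fit([1, 9]).predict([1, 9])
print(scaled)  # [0.0, 0.5, 1.0]
print(labels)  # [0, 1]
```

Both wrapped objects are then driven through the same fit and predict
calls, which is what lets a graph node treat every step uniformly.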

As general as PipeGraph is, it cannot work under the restrictions that
GridSearchCV imposes on the input parameters, namely X and y, since
PipeGraph can accept as many input signals as needed. Thus, an ad hoc
GridSearchCV version is also needed, and we will provide a basic initial
implementation in a later release.
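
To make the limitation concrete, here is a rough sketch of a parameter
search that passes an arbitrary set of named inputs to fit, rather than
only X and y. It is not the planned GridSearchCV variant: the names
grid_search, ToyModel and neg_weighted_sq_error are hypothetical, and the
sketch deliberately omits cross-validation and scores on the same data it
fits on.

```python
from itertools import product


class ToyModel:
    """Hypothetical model whose fit takes three named inputs."""

    def __init__(self, slope):
        self.slope = slope

    def fit(self, X, y, weights):
        # Nothing to learn in this toy; the slope is a fixed hyperparameter
        return self

    def predict(self, X):
        return [self.slope * x for x in X]


def neg_weighted_sq_error(model, X, y, weights):
    """Higher is better: negated weighted sum of squared errors."""
    predictions = model.predict(X)
    return -sum(w * (p - t) ** 2 for w, p, t in zip(weights, predictions, y))


def grid_search(make_model, param_grid, inputs, score):
    """Try every parameter combination; inputs may hold any named signals."""
    best_params, best_score = None, float('-inf')
    names = sorted(param_grid)
    for values in product(*(param_grid[name] for name in names)):
        params = dict(zip(names, values))
        model = make_model(**params).fit(**inputs)
        current = score(model, **inputs)
        if current > best_score:
            best_params, best_score = params, current
    return best_params, best_score


inputs = dict(X=[0, 1, 2], y=[0, 2, 4], weights=[1, 1, 2])
best_params, best_score = grid_search(
    ToyModel, {'slope': [1, 2, 3]}, inputs, neg_weighted_sq_error)
print(best_params)  # {'slope': 2}
```

The point of the sketch is only the call signature: the inputs travel as a
dictionary of named signals, so nothing in the search loop assumes exactly
two of them.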

We still need to write the documentation, and we will propose the module as
a contrib project in a few days.

Best wishes,
Manuel Castejón-Limas
_______________________________________________
scikit-learn mailing list
scikit-learn@python.org
https://mail.python.org/mailman/listinfo/scikit-learn
