I too considered passing an estimator instance as a parameter to DBSCAN.
If we want to use KDTree or BallTree, NearestNeighbors instances created
with algorithm='kd_tree' or algorithm='ball_tree' can be passed. But Robert
mentioned that this would fail the unit tests: the base test that ensures
that all BaseEstimators can be constructed requires that all class
attributes be settable directly from init parameters. This ensures that the
estimator can be used with pipelines.
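
To make the constraint concrete, here is a minimal plain-Python sketch of the
convention as I understand it. It does not use scikit-learn, and every name in
it (SketchEstimator, NeighborIndex, SketchDBSCAN, clone_sketch) is
hypothetical; it only mimics the idea that each init parameter is stored
unchanged under the same attribute name, so that a copy can be rebuilt from
the parameters alone.

```python
import inspect


class SketchEstimator:
    """Hypothetical stand-in for a BaseEstimator-like class (not scikit-learn code)."""

    def get_params(self):
        # Read the constructor's argument names and look each one up as an
        # attribute of the same name. This only works if __init__ stored
        # every argument as-is.
        signature = inspect.signature(type(self).__init__)
        names = [name for name in signature.parameters if name != "self"]
        return {name: getattr(self, name) for name in names}


def clone_sketch(estimator):
    """Rebuild an unfitted copy from constructor parameters only."""
    params = {}
    for name, value in estimator.get_params().items():
        # Nested estimators are cloned recursively; plain values are reused.
        if isinstance(value, SketchEstimator):
            value = clone_sketch(value)
        params[name] = value
    return type(estimator)(**params)


class NeighborIndex(SketchEstimator):
    def __init__(self, algorithm="kd_tree", leaf_size=30):
        self.algorithm = algorithm
        self.leaf_size = leaf_size


class SketchDBSCAN(SketchEstimator):
    def __init__(self, eps=0.5, min_samples=5, neighbors=None):
        # Store arguments untouched: no validation, no derived attributes.
        self.eps = eps
        self.min_samples = min_samples
        self.neighbors = neighbors


original = SketchDBSCAN(eps=0.3, neighbors=NeighborIndex(algorithm="ball_tree"))
copy = clone_sketch(original)
```

In this sketch the nested instance survives the round trip, which suggests
that an estimator instance passed as a parameter may be acceptable as long as
it is stored unmodified. If `__init__` stored something derived from the
argument instead (for example an already-built tree), the round trip would
break.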

How would we deal with this?




On Wed, Aug 6, 2014 at 11:22 AM, Joel Nothman <[email protected]>
wrote:

> It seems to me that the LSH forest is substituting for the `algorithm`
> parameter, which selects between ball_tree, kd_tree and brute search for
> nearest neighbour search. These are designed not to take additional
> parameters.
>
> So you need to accept additional parameters. You could indeed create
> another estimator like ApproximateNeighborsDBSCAN, but you'd need to do the
> same for KNeighborsClassifier, RadiusNeighborsClassifier,
> KNeighborsRegressor and RadiusNeighborsRegressor. That proliferation seems
> out of hand.
>
> Instead, could we have an interface in which the `algorithm` parameter
> could take any object supporting `fit(X)`, `query(X)` and
> `query_radius(X)`, such as an LSHForest instance? Indeed you could also
> make 'lsh' an available algorithm using reasonable parameters automatically
> inferred from the data, but you certainly want the user to be able to
> control the LSH parameters.
>
> (Note, currently BallTree, KDTree don't support fit(), and the index data
> is passed into their constructors.)
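
The interface proposed above could be sketched roughly as follows. This is
plain Python, not scikit-learn code: BruteForceIndex and resolve_algorithm are
hypothetical names, the method signatures are simplified assumptions (indices
only, no distances returned), and an exact brute-force search stands in for
where an LSHForest-like object would go.

```python
import math


class BruteForceIndex:
    """Hypothetical index exposing fit / query / query_radius (exact search)."""

    def fit(self, X):
        # Unlike an index built in its constructor, the data arrives here.
        self.data_ = [tuple(row) for row in X]
        return self

    def _distances(self, point):
        return [math.dist(point, row) for row in self.data_]

    def query(self, X, k=1):
        # For each query point, indices of the k closest stored points.
        result = []
        for point in X:
            d = self._distances(point)
            result.append(sorted(range(len(d)), key=d.__getitem__)[:k])
        return result

    def query_radius(self, X, r):
        # For each query point, indices of stored points within distance r.
        return [
            [i for i, dist in enumerate(self._distances(point)) if dist <= r]
            for point in X
        ]


def resolve_algorithm(algorithm):
    """Accept a known string or any object providing the three methods."""
    if isinstance(algorithm, str):
        if algorithm == "brute":
            return BruteForceIndex()
        raise ValueError("unknown algorithm name: %r" % algorithm)
    for method in ("fit", "query", "query_radius"):
        if not callable(getattr(algorithm, method, None)):
            raise TypeError("algorithm object lacks %s()" % method)
    return algorithm


points = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0)]
index = resolve_algorithm(BruteForceIndex()).fit(points)
within = index.query_radius([(0.0, 0.0)], r=1.5)
nearest = index.query([(0.0, 0.2)], k=2)
```

The duck-typing check keeps the outer estimator ignorant of the index's own
parameters, so they stay on the object the user constructs.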
>
> There is also the caveat that currently the BallTree and KDTree are passed
> an effective_metric parameter in their constructors; not all metrics are
> possible with a particular LSH implementation, and only euclidean is
> currently supported. So the outer estimator (e.g. DBSCAN) could set an
> effective_metric parameter on the algorithm object, or it could not.
>
> WDYT?
>
>
> On 6 August 2014 15:33, Maheshakya Wijewardena <[email protected]>
> wrote:
>
>> Hi,
>>
>> I'm trying to use the LSH Forest approximate neighbor search method to
>> obtain radius neighbors in DBSCAN. It adheres to the API of
>> sklearn.neighbors (at least the radius_neighbors method at the moment).
>> But LSH Forest itself has a set of parameters, so they need to be
>> initialized.
>>
>> I'm thinking about passing an argument to the DBSCAN init method, such as
>> `approximate_neighbors=True` (or something suitable), and having the LSH
>> Forest parameters in the DBSCAN init method as well.
>>
>> The other approach, which Robert suggested, is to subclass DBSCAN to use
>> approximate neighbors.
>>
>> Once LSH Forest is initialized, it's just a matter of applying it in
>> place of `NearestNeighbors`. Are the above approaches appropriate, or are
>> there better ways?
>>
>> PR to LSH Forest: https://github.com/scikit-learn/scikit-learn/pull/3304
>>
>> Best Regards,
>> Maheshakya
>>
>> --
>> Undergraduate,
>> Department of Computer Science and Engineering,
>> Faculty of Engineering.
>> University of Moratuwa,
>> Sri Lanka
>>
>>
>> _______________________________________________
>> Scikit-learn-general mailing list
>> [email protected]
>> https://lists.sourceforge.net/lists/listinfo/scikit-learn-general
>>
>>
>
>


-- 
Undergraduate,
Department of Computer Science and Engineering,
Faculty of Engineering.
University of Moratuwa,
Sri Lanka
