GitHub user sethah opened a pull request:
https://github.com/apache/spark/pull/11663
[SPARK-13068][PYSPARK][ML] Type conversion for Pyspark params
## What changes were proposed in this pull request?
This patch adds type conversion for parameters in PySpark. A `typeConverter`
argument is added to the constructor of the `Param` class. This argument is a
function that converts values passed to the param to the appropriate type
where possible. As a result, params can fail at set time if they are given
inappropriate values. More importantly, coherent error messages are now
provided in cases where Py4J cannot cast the Python type to the appropriate
Java type.
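
As a rough illustration of the idea (a simplified sketch, not the code in this patch), a param can carry a converter that is applied when the value is set. The `to_int` converter, the `HasMaxIter` class, and the `_paramMap` layout below are assumptions made for the example:

```python
class Param(object):
    """Simplified stand-in for a param that carries a type converter."""

    def __init__(self, name, doc, typeConverter=None):
        self.name = name
        self.doc = doc
        # Fall back to the identity function when no converter is given.
        if typeConverter is None:
            typeConverter = lambda value: value
        self.typeConverter = typeConverter


def to_int(value):
    """Hypothetical converter: accept ints and integral floats only."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError("Could not convert %r to int" % (value,))
    if isinstance(value, int):
        return value
    if isinstance(value, float) and value.is_integer():
        return int(value)
    raise TypeError("Could not convert %r to int" % (value,))


class HasMaxIter(object):
    """Hypothetical holder for a single integer param."""

    maxIter = Param("maxIter", "max number of iterations", typeConverter=to_int)

    def __init__(self):
        self._paramMap = {}

    def setMaxIter(self, value):
        # Conversion runs at set time, so a bad value fails here with a
        # readable Python error rather than later inside the Py4J call.
        self._paramMap[self.maxIter] = self.maxIter.typeConverter(value)
        return self
```

With this sketch, `HasMaxIter().setMaxIter(10.0)` stores the integer `10`, while `HasMaxIter().setMaxIter("ten")` raises a `TypeError` immediately.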
This patch also adds a `TypeConverters` class with factory methods for
common type conversions. Most of the changes consist of attaching these
converters to existing params. The previous solution to this issue,
`expectedType`, is deprecated and can be removed in 2.1.0, as discussed on the
JIRA.
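
A class of reusable converters might look roughly like the following. This is a minimal sketch under assumptions: the method names and the exact acceptance rules are illustrative and are not taken from the patch itself.

```python
class TypeConverters(object):
    """Illustrative collection of reusable converter functions."""

    @staticmethod
    def identity(value):
        # No-op converter for params that accept any value.
        return value

    @staticmethod
    def toFloat(value):
        # Accept ints and floats, but not bools or strings.
        if isinstance(value, bool) or not isinstance(value, (int, float)):
            raise TypeError("Could not convert %r to float" % (value,))
        return float(value)

    @staticmethod
    def toListFloat(value):
        # Accept a list or tuple whose elements are all convertible.
        if not isinstance(value, (list, tuple)):
            raise TypeError("Could not convert %r to list of floats" % (value,))
        return [TypeConverters.toFloat(v) for v in value]
```

Each param then only needs to name the converter it wants, which keeps the per-param changes small.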
## How was this patch tested?
Unit tests were added in `python/pyspark/ml/tests.py` to test parameter type
conversion. These tests check that values that should be convertible are
converted correctly, and that the appropriate errors are raised when invalid
values are provided.
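
The shape of such tests can be sketched as below. This is not the test code from the patch: the `to_list_int` converter and the test class are hypothetical and only show the two kinds of checks described above.

```python
import unittest


def to_list_int(value):
    """Hypothetical converter: list/tuple of ints or integral floats."""
    if not isinstance(value, (list, tuple)):
        raise TypeError("Could not convert %r to list of ints" % (value,))
    result = []
    for v in value:
        if isinstance(v, bool):
            raise TypeError("Could not convert %r to int" % (v,))
        if isinstance(v, int):
            result.append(v)
        elif isinstance(v, float) and v.is_integer():
            result.append(int(v))
        else:
            raise TypeError("Could not convert %r to int" % (v,))
    return result


class ParamTypeConversionTests(unittest.TestCase):

    def test_convertible_values_are_converted(self):
        # Tuples and integral floats should come back as a list of ints.
        self.assertEqual(to_list_int((1, 2.0, 3)), [1, 2, 3])

    def test_invalid_values_raise(self):
        # A non-integral float and a bare string are both rejected.
        with self.assertRaises(TypeError):
            to_list_int([1, 2.5])
        with self.assertRaises(TypeError):
            to_list_int("123")
```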
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/sethah/spark SPARK-13068-tc
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/11663.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #11663
----
commit bca206d0f9c3aeef6ce768e542f0eb4cbf9274ee
Author: sethah <[email protected]>
Date: 2016-03-10T23:25:39Z
type conversion for params
commit 798e4f89549e3fc74719359e769164d20dedefb1
Author: sethah <[email protected]>
Date: 2016-03-11T00:34:38Z
removing _convert method and using typeConverter directly in _set
commit 6e2399f1c69563a864ec1da4b111c3c6ce65200d
Author: sethah <[email protected]>
Date: 2016-03-11T16:50:39Z
docstring and typo
commit 257f46591d9198729bf2fb3e28186b9c96097e2d
Author: sethah <[email protected]>
Date: 2016-03-11T22:31:53Z
refactoring type conversions
commit 40b00dd2e3209fcf757ec3e3759ce4270096087e
Author: sethah <[email protected]>
Date: 2016-03-11T23:03:19Z
fixing deprecation message
----