GitHub user BryanCutler opened a pull request:
https://github.com/apache/spark/pull/9361
[SPARK-10158] [PySpark] [MLlib] ALS better error message when using Long IDs
Added handling for the Long-to-Int cast exception raised when PySpark ALS
Ratings are serialized. It is easy to accidentally use Long IDs for
user/product; previously this failed with a somewhat cryptic
"ClassCastException: java.lang.Long cannot be cast to java.lang.Integer."
Now a more descriptive error is shown instead, e.g. "PickleException:
Ratings id 1205640308657491975 exceeds max integer value of 2147483647."
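For illustration only, the same range check can be sketched on the Python
side, before any Ratings are built. This is a minimal sketch, not the code in
this pull request: the helper name check_rating_ids and its message wording
are hypothetical, and it assumes the IDs are plain Python ints.

```python
# Hypothetical helper, not part of this pull request or of PySpark.
# It rejects user/product ids that do not fit in a signed 32-bit integer.
INT_MAX = 2**31 - 1   # 2147483647, same value as java.lang.Integer.MAX_VALUE
INT_MIN = -2**31


def check_rating_ids(user, product):
    """Return (user, product) unchanged, or raise ValueError if either id
    falls outside the signed 32-bit integer range."""
    for name, value in (("user", user), ("product", product)):
        if not INT_MIN <= value <= INT_MAX:
            raise ValueError(
                "Ratings %s id %d is outside the 32-bit integer range [%d, %d]"
                % (name, value, INT_MIN, INT_MAX))
    return user, product
```

Calling check_rating_ids(1205640308657491975, 1) raises ValueError, while
check_rating_ids(1, 2) returns the pair unchanged. Running a check like this
before training would surface the problem earlier than the serializer does.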
You can merge this pull request into a Git repository by running:
$ git pull https://github.com/BryanCutler/spark als-pyspark-long-id-error-SPARK-10158
Alternatively you can review and apply these changes as the patch at:
https://github.com/apache/spark/pull/9361.patch
To close this pull request, make a commit to your master/trunk branch
with (at least) the following in the commit message:
This closes #9361
----
commit b6b756adc04d996b66ac2555b311ed77bf6e7ec8
Author: Bryan Cutler <[email protected]>
Date: 2015-10-29T17:25:54Z
[SPARK-10158] Added check for PySpark Ratings with ids of Long values
commit fbea910bc6305639fe12595b3deddd4ecb5fa7dc
Author: Bryan Cutler <[email protected]>
Date: 2015-10-29T17:46:37Z
[SPARK-10158] Added PySpark ALS test case for using Ratings ids with Long
values
commit bc50d1020c53290cfa59da72c7448f976c56c1cd
Author: Bryan Cutler <[email protected]>
Date: 2015-10-29T18:40:56Z
[SPARK-10158] Improved test case to just use Pickler, no need to invoke
train
commit 45da6c852c81495458dc54fd412e279885eb06f6
Author: Bryan Cutler <[email protected]>
Date: 2015-10-29T20:35:05Z
Changed wording of exception message to include 'integer'
commit 51f2479f477f3abbd5f808c55d62ae9d4ebbb15c
Author: Bryan Cutler <[email protected]>
Date: 2015-10-29T20:35:30Z
Added positive test case for ALS Ratings serialize
----