Joseph K. Bradley created SPARK-5981:
----------------------------------------
Summary: pyspark ML models fail during predict/transform on vector within map
Key: SPARK-5981
URL: https://issues.apache.org/jira/browse/SPARK-5981
Project: Spark
Issue Type: Bug
Components: MLlib, PySpark
Affects Versions: 1.3.0
Reporter: Joseph K. Bradley
Priority: Critical
Many Python ML models and transformers use JavaModelWrapper to call methods in
the JVM, such as predict() and transform(). It is common to call these methods
on individual vectors inside an RDD transformation:
{code}
data.map(lambda features: model.predict(features))
{code}
This fails because JavaModelWrapper.call uses the SparkContext, which is only
usable on the driver and not within a transformation running on the workers.
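To illustrate the failure mode, here is a minimal pure-Python sketch. It is not PySpark code: DriverOnlyContext, ModelWrapper, and ship_to_workers are hypothetical stand-ins, and the sketch assumes the failure surfaces when the closure (and the model it captures) is serialized for the workers.

```python
import pickle


class DriverOnlyContext:
    """Hypothetical stand-in for a driver-side handle such as SparkContext."""

    def __reduce__(self):
        # Refuse serialization, as a driver-only object would.
        raise RuntimeError("context can only be used on the driver")


class ModelWrapper:
    """Hypothetical stand-in for a wrapper that reaches the JVM via the context."""

    def __init__(self, context):
        self.context = context

    def predict(self, features):
        # A real wrapper would forward this call to the JVM through the context.
        return sum(features)


def ship_to_workers(obj):
    """Simulate shipping task state to a worker by pickling it."""
    try:
        pickle.dumps(obj)
        return "ok"
    except RuntimeError as exc:
        return "failed: %s" % exc


model = ModelWrapper(DriverOnlyContext())
# The model drags the driver-only context along, so shipping it fails.
result = ship_to_workers(model)
print(result)
```

Plain data such as a list of floats ships without trouble in this sketch; only the object holding the driver-side context fails.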
Note: As a workaround, call batch predict on the whole dataset, if the
model/transformer supports it:
{code}
model.predict(data)
{code}
However, since the per-vector pattern is so common, this is still a major
problem.
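When predictions need to stay paired with labels, the batch workaround can be sketched as follows. The helper name predict_with_labels is hypothetical. It assumes the input is an RDD-like object of (label, features) pairs supporting map() and zip(), and that model.predict accepts a whole RDD of feature vectors.

```python
def predict_with_labels(model, labeled_points):
    """Pair each label with a batch prediction, calling predict() on the driver.

    Assumptions: `labeled_points` is an RDD-like object of (label, features)
    pairs with map() and zip(); `model.predict` accepts an RDD of features.
    """
    labels = labeled_points.map(lambda lp: lp[0])
    features = labeled_points.map(lambda lp: lp[1])
    # predict() is invoked once with the whole dataset, so no model method
    # runs inside a worker-side closure; the lambdas above do not capture
    # the model.
    predictions = model.predict(features)
    return labels.zip(predictions)
```

Note that zip() on RDDs expects both sides to have the same partitioning and element counts, which holds here only if batch predict preserves them.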
--
This message was sent by Atlassian JIRA
(v6.3.4#6332)