Hyukjin Kwon created SPARK-19134:
------------------------------------
Summary: Fix several Python mllib and status api examples not working
Key: SPARK-19134
URL: https://issues.apache.org/jira/browse/SPARK-19134
Project: Spark
Issue Type: Bug
Components: MLlib, PySpark
Reporter: Hyukjin Kwon
Priority: Minor
{code}
./bin/spark-submit examples/src/main/python/mllib/binary_classification_metrics_example.py
{code}
{code}
  File ".../spark/examples/src/main/python/mllib/binary_classification_metrics_example.py", line 39, in <lambda>
    .rdd.map(lambda row: LabeledPoint(row[0], row[1]))
  File ".../spark/python/pyspark/mllib/regression.py", line 54, in __init__
    self.features = _convert_to_vector(features)
  File ".../spark/python/pyspark/mllib/linalg/__init__.py", line 80, in _convert_to_vector
    raise TypeError("Cannot convert type %s into Vector" % type(l))
TypeError: Cannot convert type <class 'pyspark.ml.linalg.SparseVector'> into Vector
{code}
{code}
PYSPARK_PYTHON=python3 ./bin/spark-submit examples/src/main/python/status_api_demo.py
{code}
{code}
Traceback (most recent call last):
  File ".../spark/examples/src/main/python/status_api_demo.py", line 22, in <module>
    import Queue
ImportError: No module named 'Queue'
{code}
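The {{Queue}} module was renamed to {{queue}} in Python 3, which is why the import above fails under {{PYSPARK_PYTHON=python3}}. A common compatibility pattern is sketched below; it is one option, not necessarily what the eventual patch does, and the names {{q}} and {{first}} are illustrative only.

```python
try:
    import Queue as queue  # Python 2 module name
except ImportError:
    import queue  # Python 3 module name

# The rest of the script can then use the lowercase name on both versions.
q = queue.Queue()
q.put("job-0")
first = q.get()
```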
{code}
./bin/spark-submit examples/src/main/python/mllib/bisecting_k_means_example.py
{code}
{code}
Traceback (most recent call last):
  File "/Users/hyukjinkwon/Desktop/workspace/repos/forked/spark/examples/src/main/python/mllib/bisecting_k_means_example.py", line 46, in <module>
    model.save(sc, path)
AttributeError: 'BisectingKMeansModel' object has no attribute 'save'
{code}
{code}
./bin/spark-submit examples/src/main/python/mllib/elementwise_product_example.py
{code}
{code}
Traceback (most recent call last):
  File "/Users/hyukjinkwon/Desktop/workspace/repos/forked/spark/examples/src/main/python/mllib/elementwise_product_example.py", line 48, in <module>
    for each in transformedData2.collect():
  File "/Users/hyukjinkwon/Desktop/workspace/repos/forked/spark/python/pyspark/mllib/linalg/__init__.py", line 478, in __getattr__
    return getattr(self.array, item)
AttributeError: 'numpy.ndarray' object has no attribute 'collect'
{code}
--
This message was sent by Atlassian JIRA (v6.3.4#6332)