Waleed Esmail created SPARK-23414:

             Summary: Plotting using matplotlib in MLlib pyspark 
                 Key: SPARK-23414
                 URL: https://issues.apache.org/jira/browse/SPARK-23414
             Project: Spark
          Issue Type: Question
          Components: MLlib
    Affects Versions: 2.2.1
            Reporter: Waleed Esmail

Dear MLlib experts,

I want to plot a confusion matrix (true values vs. predicted values) like
the one produced by the seaborn module in Python, so I did the following:

labelIndexer = StringIndexer(inputCol="label", 
# Automatically identify categorical features, and index them.
# We specify maxCategories so features with > 4 distinct values are treated as 
featureIndexer = VectorIndexer(inputCol="features", 

# Split the data into training and test sets (30% held out for testing)
(trainingData, testData) = output.randomSplit([0.7, 0.3])

dt = DecisionTreeClassifier(labelCol="indexedLabel", 
featuresCol="indexedFeatures", maxDepth=15)

# Chain indexers and tree in a Pipeline
pipeline = Pipeline(stages=[labelIndexer, featureIndexer, dt])
# Train model.  This also runs the indexers.
model = pipeline.fit(trainingData)

# Make predictions.
predictions = model.transform(testData)
predictionAndLabels = predictions.select("prediction", "indexedLabel")

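import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as ticker
import seaborn as sns
from sklearn.metrics import confusion_matrix

# Collect once so predictions and labels stay aligned row by row
rows = predictionAndLabels.collect()
y_predicted = np.array([row["prediction"] for row in rows])
y_test = np.array([row["indexedLabel"] for row in rows])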

figcm, ax = plt.subplots()
cm = confusion_matrix(y_test, y_predicted)
# for normalization
cm = cm.astype('float') / cm.sum(axis=1)[:, np.newaxis]
sns.heatmap(cm, square=True, annot=True, cbar=False)
plt.ylabel('true value')

Is this the right way to do it? Please note that I am new to Spark and MLlib.
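
For reference, here is a minimal sketch of the counting and row-normalization step in plain Python. It is an illustration under stated assumptions, not a Spark or scikit-learn API: the function name `normalized_confusion_matrix`, its `n_classes` parameter, and the sample `pairs` are hypothetical. It assumes the labels are the 0-based indices that StringIndexer produces. Unlike the direct division above, it leaves a row of zeros when a class has no samples in the test split, instead of dividing by zero.

```python
from collections import Counter


def normalized_confusion_matrix(pairs, n_classes):
    """Build a row-normalized confusion matrix.

    pairs: iterable of (true_label, predicted_label), labels being
           0-based class indices (ints or floats such as 0.0, 1.0).
    n_classes: number of distinct classes (hypothetical parameter).
    """
    # Count how often each (true, predicted) combination occurs.
    counts = Counter((int(t), int(p)) for t, p in pairs)

    matrix = []
    for true_idx in range(n_classes):
        row = [counts[(true_idx, pred_idx)] for pred_idx in range(n_classes)]
        total = sum(row)
        # A class absent from the test split has total == 0;
        # keep zeros there rather than dividing by zero.
        matrix.append([c / total if total else 0.0 for c in row])
    return matrix


# Hypothetical stand-in for rows collected from the predictions DataFrame,
# ordered as (indexedLabel, prediction).
pairs = [(0.0, 0.0), (0.0, 1.0), (1.0, 1.0), (1.0, 1.0)]
cm = normalized_confusion_matrix(pairs, n_classes=3)
```

With a real pipeline, `pairs` could plausibly be the result of `predictions.select("indexedLabel", "prediction").collect()`, since each collected Row unpacks like a tuple. The nested list should be usable as heatmap input, because seaborn accepts 2-D data that can be coerced to an array.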


Thank you in advance,
