Repository: spark
Updated Branches:
  refs/heads/master 7afa912e7 -> d81c08bac


[SPARK-2130] End-user friendly String repr for StorageLevel in Python

JIRA issue https://issues.apache.org/jira/browse/SPARK-2130

This PR adds an end-user-friendly string representation for StorageLevel
in Python, similar to ```StorageLevel.description``` in Scala.
```
>>> rdd = sc.parallelize([1,2])
>>> storage_level = rdd.getStorageLevel()
>>> storage_level
StorageLevel(False, False, False, False, 1)
>>> print(storage_level)
Serialized 1x Replicated
```
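
As a minimal, hedged sketch (not part of the commit), the same friendly form applies to non-default levels as well; this assumes the predefined constants in python/pyspark/storagelevel.py and that RDD.persist accepts a StorageLevel:
```
# Not from the commit: persist at a non-default level and print the
# human-readable description provided by the new __str__.
from pyspark.storagelevel import StorageLevel

rdd = sc.parallelize([1, 2])           # assumes an existing SparkContext `sc`
rdd.persist(StorageLevel.DISK_ONLY_2)  # StorageLevel(True, False, False, False, 2)
print(rdd.getStorageLevel())           # expected: Disk Serialized 2x Replicated
```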

Author: Kan Zhang <kzh...@apache.org>

Closes #1096 from kanzhang/SPARK-2130 and squashes the following commits:

7c8b98b [Kan Zhang] [SPARK-2130] Prettier epydoc output
cc5bf45 [Kan Zhang] [SPARK-2130] End-user friendly String representation for StorageLevel in Python


Project: http://git-wip-us.apache.org/repos/asf/spark/repo
Commit: http://git-wip-us.apache.org/repos/asf/spark/commit/d81c08ba
Tree: http://git-wip-us.apache.org/repos/asf/spark/tree/d81c08ba
Diff: http://git-wip-us.apache.org/repos/asf/spark/diff/d81c08ba

Branch: refs/heads/master
Commit: d81c08bac9756045865ed6490252fbb3f7591142
Parents: 7afa912
Author: Kan Zhang <kzh...@apache.org>
Authored: Mon Jun 16 23:31:31 2014 -0700
Committer: Patrick Wendell <pwend...@gmail.com>
Committed: Mon Jun 16 23:31:31 2014 -0700

----------------------------------------------------------------------
 python/pyspark/rdd.py          | 3 +++
 python/pyspark/storagelevel.py | 9 +++++++++
 2 files changed, 12 insertions(+)
----------------------------------------------------------------------


http://git-wip-us.apache.org/repos/asf/spark/blob/d81c08ba/python/pyspark/rdd.py
----------------------------------------------------------------------
diff --git a/python/pyspark/rdd.py b/python/pyspark/rdd.py
index ddd2285..bb4d035 100644
--- a/python/pyspark/rdd.py
+++ b/python/pyspark/rdd.py
@@ -1448,9 +1448,12 @@ class RDD(object):
     def getStorageLevel(self):
         """
         Get the RDD's current storage level.
+
         >>> rdd1 = sc.parallelize([1,2])
         >>> rdd1.getStorageLevel()
         StorageLevel(False, False, False, False, 1)
+        >>> print(rdd1.getStorageLevel())
+        Serialized 1x Replicated
         """
         java_storage_level = self._jrdd.getStorageLevel()
         storage_level = StorageLevel(java_storage_level.useDisk(),

http://git-wip-us.apache.org/repos/asf/spark/blob/d81c08ba/python/pyspark/storagelevel.py
----------------------------------------------------------------------
diff --git a/python/pyspark/storagelevel.py b/python/pyspark/storagelevel.py
index 7b6660e..3a18ea5 100644
--- a/python/pyspark/storagelevel.py
+++ b/python/pyspark/storagelevel.py
@@ -36,6 +36,15 @@ class StorageLevel:
         return "StorageLevel(%s, %s, %s, %s, %s)" % (
             self.useDisk, self.useMemory, self.useOffHeap, self.deserialized, self.replication)
 
+    def __str__(self):
+        result = ""
+        result += "Disk " if self.useDisk else ""
+        result += "Memory " if self.useMemory else ""
+        result += "Tachyon " if self.useOffHeap else ""
+        result += "Deserialized " if self.deserialized else "Serialized "
+        result += "%sx Replicated" % self.replication
+        return result
+
 StorageLevel.DISK_ONLY = StorageLevel(True, False, False, False)
 StorageLevel.DISK_ONLY_2 = StorageLevel(True, False, False, False, 2)
 StorageLevel.MEMORY_ONLY = StorageLevel(False, True, False, True)

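For reference, a minimal sketch (not part of the commit) of what the new __str__ is expected to print for the predefined levels visible in this hunk; the outputs follow directly from the constructor flags shown above:
```
# Expected human-readable output of the new __str__ for constants defined
# with (useDisk, useMemory, useOffHeap, deserialized[, replication]).
from pyspark.storagelevel import StorageLevel

print(StorageLevel.DISK_ONLY)    # Disk Serialized 1x Replicated
print(StorageLevel.DISK_ONLY_2)  # Disk Serialized 2x Replicated
print(StorageLevel.MEMORY_ONLY)  # Memory Deserialized 1x Replicated
```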