Re: [PR] [SPARK-47014][PYTHON][CONNECT] Implement methods dumpPerfProfiles and dumpMemoryProfiles of SparkSession [spark]

via GitHub Tue, 13 Feb 2024 14:52:53 -0800


xinrong-meng commented on code in PR #45073:
URL: https://github.com/apache/spark/pull/45073#discussion_r1488687912



##########
python/pyspark/sql/profiler.py:
##########
@@ -158,6 +159,70 @@ def _profile_results(self) -> "ProfileResults":
         """
         ...
 
+    def dump_perf_profiles(self, path: str, id: Optional[int] = None) -> None:
+        """
+        Dump the perf profile results into directory `path`.
+
+        .. versionadded:: 4.0.0
+
+        Parameters
+        ----------
+        path: str
+            A directory in which to dump the perf profile.
+        id : int, optional
+            A UDF ID to be shown. If not specified, all the results will be shown.
+        """
+        with self._lock:
+            stats = self._perf_profile_results
+
+        def dump(path: str, id: int) -> None:
+            s = stats.get(id)
+
+            if s is not None:
+                if not os.path.exists(path):
+                    os.makedirs(path)
+                p = os.path.join(path, "udf_%d.pstats" % id)

Review Comment:
   Makes sense! I'll reuse `f"udf_{id}_memory.txt"` for memory profiles.
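
   For reference, a minimal sketch of the two file names side by side. It
   assumes the perf name stays equivalent to `"udf_%d.pstats" % id` from the
   diff above and that memory profiles use `f"udf_{id}_memory.txt"`. The
   helper names (`perf_dump_path`, `memory_dump_path`, `dump_perf`) are
   hypothetical and not part of the PR; the profiled call is only a stand-in
   for a UDF.

```python
import cProfile
import os
import pstats
import tempfile


def perf_dump_path(path: str, udf_id: int) -> str:
    # Same file name as "udf_%d.pstats" % id in the diff, written as an f-string.
    return os.path.join(path, f"udf_{udf_id}.pstats")


def memory_dump_path(path: str, udf_id: int) -> str:
    # File name mentioned in the review comment for memory profiles.
    return os.path.join(path, f"udf_{udf_id}_memory.txt")


def dump_perf(stats: pstats.Stats, path: str, udf_id: int) -> str:
    # Hypothetical helper: create the directory if needed, then write the stats.
    os.makedirs(path, exist_ok=True)
    target = perf_dump_path(path, udf_id)
    stats.dump_stats(target)
    return target


# Usage: profile a trivial call and dump it under a temporary directory.
profiler = cProfile.Profile()
profiler.runcall(sum, range(1000))
out_dir = tempfile.mkdtemp()
written = dump_perf(pstats.Stats(profiler), out_dir, 2)
```

   `os.makedirs(path, exist_ok=True)` stands in for the `os.path.exists`
   check in the diff; whether the PR adopts that is a separate question.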



-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

To unsubscribe, e-mail: [email protected]

For queries about this service, please contact Infrastructure at:
[email protected]


