HyukjinKwon commented on a change in pull request #28652:
URL: https://github.com/apache/spark/pull/28652#discussion_r430981667



##########
File path: python/pyspark/sql/dataframe.py
##########
@@ -2219,6 +2219,20 @@ def semanticHash(self):
         """
         return self._jdf.semanticHash()
 
+    @since(3.1)
+    def inputFiles(self):
+        """
+        Returns a best-effort snapshot of the files that compose this :class:`DataFrame`.
+        This method simply asks each constituent BaseRelation for its respective files and
+        takes the union of all results. Depending on the source relations, this may not find
+        all input files. Duplicates are removed.
+
+        >>> df = spark.read.load("examples/src/main/resources/people.json", format="json")
+        >>> len(df.inputFiles())
+        1
+        """
+        return [f for f in self._jdf.inputFiles()]

Review comment:
       You can just `return list(self._jdf.inputFiles())`
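   For illustration, a minimal sketch (plain Python, no Spark required) of why the suggestion is a safe drop-in: a pass-through list comprehension over an iterable produces the same result as calling `list(...)` on it. The tuple below is a hypothetical stand-in for the Java-side `self._jdf.inputFiles()` result.

   ```python
   # Hypothetical stand-in for the sequence returned by self._jdf.inputFiles().
   files = ("a.json", "b.json", "c.json")

   # The PR's original form: a comprehension that just passes elements through.
   via_comprehension = [f for f in files]

   # The reviewer's suggested form: construct the list directly.
   via_list = list(files)

   assert via_comprehension == via_list == ["a.json", "b.json", "c.json"]
   ```

   Beyond brevity, `list(...)` avoids an unnecessary per-element loop in Python bytecode and states the intent (materialize as a list) directly.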




----------------------------------------------------------------
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

For queries about this service, please contact Infrastructure at:
[email protected]


