Re: [PR] GH-43684: [Python][Dataset] Python / Cython interface to C++ arrow::dataset::Partitioning::Format [arrow]

via GitHub Mon, 26 Aug 2024 02:36:38 -0700


mapleFU commented on code in PR #43740:
URL: https://github.com/apache/arrow/pull/43740#discussion_r1730981644



##########
python/pyarrow/_dataset.pyx:
##########
@@ -2505,6 +2505,43 @@ cdef class Partitioning(_Weakrefable):
         result = self.partitioning.Parse(tobytes(path))
         return Expression.wrap(GetResultValue(result))
 
+    def format(self, expr):
+        """
+        Convert a filter expression into a tuple of (directory, filename)
+        using the current partitioning scheme
+
+        Parameters
+        ----------
+        expr : pyarrow.dataset.Expression
+
+        Returns
+        -------
+        tuple[str, str]
+
+        Examples
+        --------
+
+        Specify the Schema for paths like "/2009/June":
+
+        >>> import pyarrow as pa
+        >>> import pyarrow.dataset as ds
+        >>> import pyarrow.compute as pc
+        >>> part = ds.partitioning(pa.schema([("year", pa.int16()),
+        ...                                   ("month", pa.string())]))
+        >>> part.format(
+        ...     (pc.field("year") == 1862) & (pc.field("month") == "Jan")
+        ... )
+        """
+        cdef:
+            CResult[CPartitionPathFormat] result

Review Comment:
   @pitrou I'm not familiar with Cython and couldn't find the answer: do you 
think declaring `result` separately and assigning it later has a cost, or 
would the compiler optimize the separate assignment away?
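
For readers following the docstring in the hunk above, here is a minimal pure-Python sketch of what a directory-style `format` might produce. It is not the PR's implementation: the real method takes a `pyarrow.dataset.Expression` and delegates to C++. The helper name `format_directory_partition`, its dict-based input, the empty filename component, and the stop-at-first-missing-field behaviour are all assumptions made for illustration.

```python
from typing import Mapping, Sequence, Tuple


def format_directory_partition(
    field_names: Sequence[str], values: Mapping[str, object]
) -> Tuple[str, str]:
    """Hypothetical stand-in for a directory-style Partitioning.format.

    field_names: partition fields in schema order, e.g. ["year", "month"].
    values: the equality constraints from the filter, keyed by field name.
    Returns a (directory, filename) tuple.
    """
    segments = []
    for name in field_names:
        if name not in values:
            # Simplifying assumption: stop at the first unconstrained field.
            break
        segments.append(str(values[name]))
    # Assumption: directory-style schemes put everything in the directory
    # part and leave the filename part empty.
    return "/".join(segments), ""


directory, filename = format_directory_partition(
    ["year", "month"], {"year": 1862, "month": "Jan"}
)
print((directory, filename))
```

The sketch mirrors the docstring example, where `year == 1862` and `month == "Jan"` are mapped onto a path such as "1862/Jan".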



