mapleFU commented on code in PR #43740:
URL: https://github.com/apache/arrow/pull/43740#discussion_r1730981644
##########
python/pyarrow/_dataset.pyx:
##########
@@ -2505,6 +2505,43 @@ cdef class Partitioning(_Weakrefable):
result = self.partitioning.Parse(tobytes(path))
return Expression.wrap(GetResultValue(result))
+ def format(self, expr):
+ """
+ Convert a filter expression into a tuple of (directory, filename) using
+ the current partitioning scheme
+
+ Parameters
+ ----------
+ expr : pyarrow.dataset.Expression
+
+ Returns
+ -------
+ tuple[str, str]
+
+ Examples
+ --------
+
+ Specify the Schema for paths like "/2009/June":
+
+ >>> import pyarrow as pa
+ >>> import pyarrow.dataset as ds
+ >>> import pyarrow.compute as pc
+ >>> part = ds.partitioning(pa.schema([("year", pa.int16()),
+ ... ("month", pa.string())]))
+ >>> part.format(
+ ... (pc.field("year") == 1862) & (pc.field("month") == "Jan")
+ ... )
+ """
+ cdef:
+ CResult[CPartitionPathFormat] result
Review Comment:
@pitrou I'm not familiar with CPython and couldn't find the answer. Do you
think declaring `result` separately from its assignment would have a cost, or
could the compiler optimize the separate assignment away?
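
For context on what the new method is documented to return, here is a rough pure-Python sketch of the (directory, filename) result for a directory-style partitioning. It is only an illustration, not the pyarrow implementation: `format_directory` is a hypothetical helper, and the rule that formatting stops at the first unconstrained field is an assumption.

```python
# Hypothetical pure-Python sketch; NOT the pyarrow implementation.
def format_directory(field_names, equalities):
    """Return (directory, filename) for equality constraints on partition fields.

    field_names: partition field names in schema order.
    equalities: mapping of field name -> required value.
    """
    segments = []
    for name in field_names:
        if name not in equalities:
            # Assumption: stop at the first unconstrained field, because
            # deeper directory levels cannot be named without it.
            break
        segments.append(str(equalities[name]))
    # In this sketch everything goes into the directory part,
    # so the filename part is left empty.
    return "/".join(segments), ""


result = format_directory(["year", "month"], {"year": 1862, "month": "Jan"})
print(result)
```

The dictionary stands in for the `(pc.field("year") == 1862) & (pc.field("month") == "Jan")` expression from the docstring example.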