[GitHub] [arrow] jorisvandenbossche commented on a change in pull request #10661: ARROW-8655: [C++][Python] Preserve partitioning information for a discovered Dataset

GitBox Thu, 15 Jul 2021 03:28:46 -0700


jorisvandenbossche commented on a change in pull request #10661:
URL: https://github.com/apache/arrow/pull/10661#discussion_r670338982




##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -2083,6 +2102,27 @@ cdef class DirectoryPartitioning(Partitioning):
         return PartitioningFactory.wrap(
             CDirectoryPartitioning.MakeFactory(c_field_names, c_options))
 
+    @property
+    def dictionaries(self):
+        """
+        The unique values for each partition field, if available.
+
+        Those values are only available if the Partitioning object was
+        created through dataset discovery from a PartitioningFactory, or
+        if the dictionaries were manually specified in the constructor.
+        If not available, this returns None.
+        """
+        cdef vector[shared_ptr[CArray]] c_arrays
+        c_arrays = self.directory_partitioning.dictionaries()
+        res = []
+        for arr in c_arrays:
+            if arr.get() == nullptr:
+                # Partitioning object has not been created through
+                # inspected Factory
+                return None
+            res.append(pyarrow_wrap_array(arr))

Review comment:
       In that case `pyarrow_wrap_array` will raise an error (it checks 
whether `arr` is NULL), so at least it won't segfault, I think.
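
To make the point concrete, here is a minimal pure-Python sketch of the two layers of protection. It does not use pyarrow: `FakeArray`, `wrap_array` and `dictionaries` are hypothetical stand-ins for the wrapped array, `pyarrow_wrap_array` and the property in the diff, `None` plays the role of a null `shared_ptr`, and the `ValueError` type is an assumption about how the real wrapper reports a NULL array.

```python
from typing import List, Optional, Sequence


class FakeArray:
    """Hypothetical stand-in for a wrapped Arrow array."""

    def __init__(self, values: Sequence):
        self.values = list(values)


def wrap_array(handle: Optional[Sequence]) -> FakeArray:
    """Stand-in for pyarrow_wrap_array: a null handle raises instead of crashing."""
    if handle is None:
        # Assumed exception type; the point is that this is a Python error,
        # not a segfault.
        raise ValueError("Array was NULL")
    return FakeArray(handle)


def dictionaries(handles: Sequence[Optional[Sequence]]) -> Optional[List[FakeArray]]:
    """Mirror the loop in the diff: return None as soon as a dictionary is missing."""
    res = []
    for handle in handles:
        if handle is None:
            # Partitioning object has not been created through an inspected factory
            return None
        res.append(wrap_array(handle))
    return res
```

With the explicit check, `dictionaries([["a", "b"], None])` returns `None`. Without it, the same input would reach `wrap_array(None)` and surface as a Python exception, which is the behaviour the comment describes.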



