amol- commented on a change in pull request #11008:
URL: https://github.com/apache/arrow/pull/11008#discussion_r699178589
##########
File path: python/pyarrow/_dataset.pyx
##########
@@ -1998,6 +1998,41 @@ cdef class PartitioningFactory(_Weakrefable):
cdef inline shared_ptr[CPartitioningFactory] unwrap(self):
return self.wrapped
+ @property
+ def type_name(self):
+ return frombytes(self.factory.type_name())
+
+ def create_with_schema(self, schema):
Review comment:
I changed `write_dataset` to accept `partitioning` plus
`partitioning_flavor`; see the test:
```python
ds.write_dataset(table, tempdir, format='parquet',
                 partitioning=["b"], partitioning_flavor="hive")
```
So we are no longer using a factory. I'll update the documentation if we
agree this is the final API we want users to rely on.
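For context, the practical difference between the default directory flavor and `partitioning_flavor="hive"` is the on-disk path layout: hive encodes the field name as `key=value` in each directory segment. A minimal pure-Python sketch of the naming scheme (the helper name is mine, for illustration; pyarrow builds these paths internally when `write_dataset` partitions a table):

```python
def partition_path(flavor, fields, values):
    """Illustrative only: build the partition directory path for one group.

    Not pyarrow API; this just shows the two path flavors.
    """
    if flavor == "hive":
        # Hive flavor encodes the field name in each segment: b=1/...
        parts = ["{}={}".format(f, v) for f, v in zip(fields, values)]
    else:
        # Directory flavor uses the value alone: 1/...
        parts = [str(v) for v in values]
    return "/".join(parts)

print(partition_path("hive", ["b"], [1]))       # b=1
print(partition_path("directory", ["b"], [1]))  # 1
```

This is why hive-flavored datasets remain self-describing even when read without a schema: the field name travels in the path.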
##########
File path: python/pyarrow/tests/test_dataset.py
##########
@@ -1577,6 +1577,38 @@ def test_dictionary_partitioning_outer_nulls_raises(tempdir):
ds.write_dataset(table, tempdir, format='parquet', partitioning=part)
+def test_write_dataset_with_field_names(tempdir):
Review comment:
:+1: moved the tests
##########
File path: python/pyarrow/dataset.py
##########
@@ -788,7 +805,8 @@ def file_visitor(written_file):
if max_partitions is None:
max_partitions = 1024
- partitioning = _ensure_write_partitioning(partitioning)
+ partitioning = _ensure_write_partitioning(partitioning, schema=schema,
Review comment:
This should address those cases; I've added tests for them.
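For readers following the thread: passing `schema` here is what lets the helper resolve bare field names (as in `partitioning=["b"]`) into a concrete partitioning. A hedged sketch of that kind of normalization (the real `_ensure_write_partitioning` lives in `pyarrow/dataset.py` and differs in detail; everything below is an illustrative stand-in):

```python
def ensure_write_partitioning(partitioning, schema, flavor=None):
    """Illustrative sketch: normalize user input into a partition spec.

    Assumption-laden stand-in, not the actual pyarrow implementation.
    `schema` is modeled here as a list of field names.
    """
    if partitioning is None:
        return {"fields": [], "flavor": flavor}
    if isinstance(partitioning, (list, tuple)):
        # Bare field names: validate them against the schema.
        missing = [f for f in partitioning if f not in schema]
        if missing:
            raise ValueError("partition fields not in schema: %r" % missing)
        return {"fields": list(partitioning), "flavor": flavor or "directory"}
    # Assume an already-constructed partitioning object passes through.
    return partitioning

print(ensure_write_partitioning(["b"], ["a", "b", "c"], flavor="hive"))
```

The point of the design choice is that validation against the schema happens once, up front, instead of failing later mid-write.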
--
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.
To unsubscribe, e-mail: [email protected]
For queries about this service, please contact Infrastructure at:
[email protected]