houqp commented on issue #133: URL: https://github.com/apache/arrow-datafusion/issues/133#issuecomment-840044928
Hive partitioning is the most commonly used scheme, but there are others as well. For example, the Python Arrow package (pyarrow) supports both directory partitioning and Hive partitioning: https://arrow.apache.org/docs/python/generated/pyarrow.dataset.partitioning.html?highlight=partition. I agree with @Dandandan that we should add the concept of a partition column first, then tackle how we serialize and deserialize partition values from file paths. I can see us going the pyarrow route as well, i.e. supporting multiple partitioning schemes.
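To make the difference between the two schemes concrete, here is a rough sketch of how partition values could be recovered from a file path under each one. It is plain Python for illustration only, not DataFusion or pyarrow code: `parse_hive_path` and `parse_directory_path` are hypothetical helpers, the example paths are made up, and paths are assumed to be relative to the table root.

```python
from typing import Dict, List


def parse_hive_path(path: str) -> Dict[str, str]:
    """Hive scheme: each directory segment has the form `key=value`.

    Hypothetical helper; assumes `path` is relative to the table root.
    """
    values: Dict[str, str] = {}
    # Drop the last segment, which is the file name.
    for segment in path.strip("/").split("/")[:-1]:
        if "=" in segment:
            key, value = segment.split("=", 1)
            values[key] = value
    return values


def parse_directory_path(path: str, columns: List[str]) -> Dict[str, str]:
    """Directory scheme: segments are bare values; names come from `columns`.

    Hypothetical helper; the caller must supply the partition column names
    in order, because the path itself does not carry them.
    """
    segments = path.strip("/").split("/")[:-1]
    return dict(zip(columns, segments))


# Made-up example paths describing the same partition under each scheme.
hive = parse_hive_path("year=2021/month=05/part-0.parquet")
directory = parse_directory_path("2021/05/part-0.parquet", ["year", "month"])
print(hive)
print(directory)
```

The point of the sketch is that both schemes produce the same set of partition columns and values, and only the path encoding differs. That is why it seems reasonable to model partition columns first and treat the path encoding as a pluggable detail.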
