jorisvandenbossche commented on pull request #9677: URL: https://github.com/apache/arrow/pull/9677#issuecomment-797737946
> We could; we'd have to do that recursively, right? In case of a nested dictionary. (…though is that handled anyways?)

I don't think we can parse nested types from the file paths? In that case, we wouldn't need to check it recursively.

From a user point of view, having to specify `dictionaries="infer"` feels superfluous, as it is clear that it's needed (but to be clear, this PR is already a nice improvement compared to the current situation! ;))

> It also doesn't help the fact that we need a Partitioning, not a PartitioningFactory, when we want to write data, so the auto-detection might be a little too magical…

Hmm, yes, that complicates things. When writing, you don't need to specify the dictionaries. But indeed you still need the actual Partitioning and not the factory. So returning the factory *if* the schema has a dictionary type and no dictionaries are passed would then fail when writing. The current API, which mixes reading and writing as well as the full object and the factory, makes this a bit complex.
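To make the concern about auto-detection concrete, here is a minimal toy model in plain Python. It is a sketch only: `ToyPartitioning`, `ToyPartitioningFactory`, `auto_partitioning`, and `write_dataset` are hypothetical stand-ins invented for illustration, not pyarrow's classes or functions, and the schema is simplified to a dict of type-name strings.

```python
from dataclasses import dataclass, field


# Toy stand-ins (NOT pyarrow classes): they only model the distinction
# between a concrete partitioning and a factory discussed in the comment.
@dataclass
class ToyPartitioning:
    """Fully specified partitioning: usable for reading and writing."""
    schema: dict  # field name -> type name, e.g. {"year": "int32"}
    dictionaries: dict = field(default_factory=dict)


@dataclass
class ToyPartitioningFactory:
    """Infers missing details (dictionary values) from file paths when reading."""
    schema: dict


def has_dictionary_field(schema: dict) -> bool:
    # Flat check only. Assumption taken from the comment: partition values
    # parsed from file paths are not nested, so no recursion is needed.
    return any(t.startswith("dictionary") for t in schema.values())


def auto_partitioning(schema: dict, dictionaries=None):
    """Hypothetical auto-detection: return a factory when a dictionary-typed
    field is present and no dictionaries were passed."""
    if has_dictionary_field(schema) and dictionaries is None:
        return ToyPartitioningFactory(schema)
    return ToyPartitioning(schema, dictionaries or {})


def write_dataset(partitioning) -> str:
    """Toy writer: needs the concrete partitioning, not the factory."""
    if not isinstance(partitioning, ToyPartitioning):
        raise TypeError("writing requires a Partitioning, not a factory")
    return "written"


plain = auto_partitioning({"year": "int32"})
dict_typed = auto_partitioning({"part": "dictionary<int32, string>"})

write_dataset(plain)  # fine: a concrete partitioning
try:
    write_dataset(dict_typed)  # the silently returned factory breaks writing
except TypeError as exc:
    print(exc)
```

In this model the same call works for writing or fails depending only on the field types in the schema, which is the "too magical" behaviour the quoted remark worries about. An explicit opt-in such as `dictionaries="infer"` keeps the returned type predictable, at the cost of the extra argument.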
