ryancasburn-KAI commented on issue #43442: URL: https://github.com/apache/arrow/issues/43442#issuecomment-2745248359
@alexshpilkin I did find a workaround for that issue. `use_dictionary` can also take a list of column names in addition to a Boolean. So if you have:

- ColumnA: RLE_DICTIONARY
- ColumnB: DELTA_BINARY_PACKED
- ColumnC: RLE_DICTIONARY
- ColumnD: DELTA_BYTE_ARRAY
- ColumnE: RLE_DICTIONARY

you can do:

`pq.write_table(table, where, use_dictionary=["ColumnA", "ColumnC", "ColumnE"], column_encoding={"ColumnB": "DELTA_BINARY_PACKED", "ColumnD": "DELTA_BYTE_ARRAY"})`

This works, but:

1. It is kind of clunky. You now have to put all of your column names in your `write_table` call.
2. It doesn't have the flexibility of the fallback approach described in the C++ documentation I quoted above. You have to be confident that DELTA_BYTE_ARRAY (or whatever you select) is actually going to be better than dictionary encoding, and better for all row groups.

I think this area could use a rework to improve usability.
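One possible way to soften the clunkiness of listing columns twice is to keep the per-column plan in a single dict and derive both keyword arguments from it. The following is only a minimal sketch: `split_encodings` and `write_with_plan` are hypothetical helper names, not part of pyarrow, and the sketch assumes every column in the plan is either `RLE_DICTIONARY` or an encoding that `column_encoding` accepts. It does not restore the dictionary fallback behaviour.

```python
def split_encodings(plan):
    """Turn a {column: encoding} plan into keyword arguments for write_table.

    Hypothetical helper: columns marked RLE_DICTIONARY go into the
    use_dictionary list, everything else goes into column_encoding.
    """
    use_dictionary = [col for col, enc in plan.items() if enc == "RLE_DICTIONARY"]
    column_encoding = {col: enc for col, enc in plan.items() if enc != "RLE_DICTIONARY"}
    return {"use_dictionary": use_dictionary, "column_encoding": column_encoding}


def write_with_plan(table, where, plan):
    """Write `table` to `where` using the plan (hypothetical wrapper, needs pyarrow)."""
    import pyarrow.parquet as pq  # imported lazily so the helper above has no dependencies

    pq.write_table(table, where, **split_encodings(plan))


# The five-column layout from the example above.
plan = {
    "ColumnA": "RLE_DICTIONARY",
    "ColumnB": "DELTA_BINARY_PACKED",
    "ColumnC": "RLE_DICTIONARY",
    "ColumnD": "DELTA_BYTE_ARRAY",
    "ColumnE": "RLE_DICTIONARY",
}
kwargs = split_encodings(plan)
```

With this, the call becomes `write_with_plan(table, where, plan)`, and the column names live in one place. You still have to choose the non-dictionary encodings up front.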