[GitHub] [arrow] AlenkaF commented on a change in pull request #11724: ARROW-13781: [Python] Allow per column encoding in parquet writer

GitBox Mon, 22 Nov 2021 00:57:24 -0800


AlenkaF commented on a change in pull request #11724:
URL: https://github.com/apache/arrow/pull/11724#discussion_r754061348




##########
File path: python/pyarrow/_parquet.pyx
##########
@@ -1284,6 +1294,14 @@ cdef shared_ptr[WriterProperties] _create_writer_properties(
             props.encoding(tobytes(column),
                            ParquetEncoding_BYTE_STREAM_SPLIT)
 
+    # col_encoding
+    # encoding map - encode individual columns
+
+    if col_encoding is not None:
+        for column, _encoding in col_encoding.items():
+            props.encoding(tobytes(column),
+                           encoding_enum_from_name(_encoding))

Review comment:
   Yes, that covers number 1) in 
https://github.com/apache/arrow/pull/11724#discussion_r751195257, right? My 
question is about an unknown column name, which is number 2) =)
   
   For example, if we have a table with columns `a` and `b` and you specify an 
encoding for an unknown column `c`, it does nothing, and I am not sure how to 
raise an error in this case.
   
   That is, if I understood number 2) correctly.
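
   To make the unknown-column case concrete, here is a minimal sketch of the kind of check that could raise. It assumes the table's column names are available wherever the check runs, which may not be true inside `_create_writer_properties`. The helper name `check_column_encoding` and its `column_names` argument are hypothetical and not part of this PR.

```python
def check_column_encoding(col_encoding, column_names):
    """Raise if col_encoding names a column that is not in column_names.

    Hypothetical helper for illustration only; not part of pyarrow.
    """
    if col_encoding is None:
        return
    # Keys of the encoding map that do not match any known column.
    unknown = sorted(set(col_encoding) - set(column_names))
    if unknown:
        raise ValueError(
            "col_encoding refers to unknown column(s): " + ", ".join(unknown)
        )


# Table has columns `a` and `b`; the encoding map also names `c`.
try:
    check_column_encoding({"a": "PLAIN", "c": "PLAIN"}, ["a", "b"])
except ValueError as exc:
    print(exc)
```

   With something like this, specifying an encoding for `c` would fail loudly instead of being silently ignored. Where such a check should live depends on where the schema is known.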




-- 
This is an automated message from the Apache Git Service.
To respond to the message, please log on to GitHub and use the
URL above to go to the specific comment.

