Hi Team,

Question 1: Does pyarrow support writing Parquet files with run-length encoding? The Python docs mention it under the compression section ("can be compressed after the encoding passes (dictionary, RLE encoding)"):
https://arrow.apache.org/docs/python/parquet.html#compression-encoding-and-file-compatibility

However, I am not seeing the option in the API reference:
https://arrow.apache.org/docs/python/generated/pyarrow.parquet.write_table.html#pyarrow.parquet.write_table

I do note it is covered in the C++ documentation; is there any way to access this from Python?
https://arrow.apache.org/docs/cpp/parquet.html
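For concreteness, here is a minimal sketch of what I am doing today (the sensor data is made up). My possibly wrong understanding is that use_dictionary=True means the dictionary indices end up run-length encoded (RLE_DICTIONARY), but I cannot find a way to request plain RLE as the column encoding directly:

    import pyarrow as pa
    import pyarrow.parquet as pq

    # Made-up sensor readings with long runs of repeated values --
    # exactly the shape of data RLE should exploit.
    table = pa.table({
        "sensor_id": ["a"] * 500_000 + ["b"] * 500_000,
        "value": [20.0] * 900_000 + [21.0] * 100_000,
    })

    # use_dictionary=True (the default) dictionary-encodes each column;
    # as far as I can tell the indices are then RLE-encoded internally,
    # but there is no keyword to ask for RLE as the data page encoding.
    pq.write_table(
        table,
        "sensors.parquet",
        use_dictionary=True,
        compression="snappy",
        data_page_version="2.0",
    )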
Question 2: In addition to the above, I would like to know whether there is any way to apply this kind of encoding to data in transit over a network. Our actual use case involves a large volume of data and would benefit GREATLY from run-length encoding, because the values are highly repetitive (sensor values that do not change very often). We are trying to send this data from a warehouse (the warehouse has not been selected yet) to an application back end, which in turn sends it on to an application front end for visualisation. A rough sketch of the direction I am imagining is below.
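This uses Arrow IPC streams; the dictionary_encode() step and the lz4 codec are my own guesses at what might help, and they give general-purpose compression of the repeated values rather than true RLE, which is the gap I am asking about:

    import pyarrow as pa
    import pyarrow.ipc as ipc

    def serialize_for_wire(table: pa.Table) -> bytes:
        # Dictionary-encode string columns so the stream carries small
        # integer indices instead of the repeated values themselves.
        encoded = pa.table({
            name: col.dictionary_encode() if pa.types.is_string(col.type) else col
            for name, col in zip(table.column_names, table.columns)
        })
        sink = pa.BufferOutputStream()
        # Buffer-level compression; lz4 and zstd are the codecs IPC supports.
        options = ipc.IpcWriteOptions(compression="lz4")
        with ipc.new_stream(sink, encoded.schema, options=options) as writer:
            writer.write_table(encoded)
        return sink.getvalue().to_pybytes()

The receiving side can read this back with ipc.open_stream. Is something along these lines reasonable, or is there a better-suited mechanism for encoding Arrow data on the wire?

Kind regards,
Nikhil Makan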
