Hi Xin,
thanks for the interest in extending Parquet. I suppose this is only about the Parquet writer/reader implementation, not about changes to the Parquet specification. I would like to know whether offloading the task of compressing/decompressing data is really beneficial performance-wise; I don't yet understand how all of this would come together. Here are my points:

- The accelerator might require the compressed data to be copied over before it can decompress it, and it would have to copy the result back. This would only make compression/decompression slower, since many of the supported codecs already have quite fast parsers and decompressors.
- Even if the data doesn't have to be copied over, I suppose this accelerator is connected over the PCI-E bus, so reading chunks would be expensive. Also, many of those decompressors reference previously observed chunks and perform a memcpy; the accelerator implementation has to be smart about those things.
- Many of the decompressors do some decoding and essentially perform a memcpy, which makes them quite fast.
- Can the supported codecs like zstd, lz4, etc. run on those accelerators? Have you done any measurements?

Kind regards,
Martin

________________________________
From: Dong, Xin <[email protected]>
Sent: Wednesday, March 4, 2020 1:46:29 AM
To: [email protected]
Subject: Provide pluggable APIs to support user customized compression codec

Hi,

In demand of better performance, quite a few end users want to leverage accelerators (e.g. FPGA, Intel QAT) to offload compression computation. However, in the current parquet-mr code, the codec implementation can't be customized to leverage accelerators. We would like to propose a pluggable API to support customized compression codecs. I've opened a JIRA, https://issues.apache.org/jira/browse/PARQUET-1804, for this issue. What are your thoughts?

Best Regards,
Xin Dong
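
[Editor's note: purely as an illustration of the kind of pluggable API being discussed, here is a minimal Java sketch. The PluggableCompressionCodec interface and DeflatePluginCodec class names are invented for this example; they are not the parquet-mr API nor the design attached to PARQUET-1804. The JDK Deflater/Inflater classes stand in for an accelerator-backed engine (QAT, FPGA), which a real plugin would delegate to instead.]

// Hypothetical strawman interface; not the actual parquet-mr or PARQUET-1804 API.
import java.io.IOException;
import java.util.Arrays;
import java.util.zip.DataFormatException;
import java.util.zip.Deflater;
import java.util.zip.Inflater;

interface PluggableCompressionCodec {
  /** Codec name as it would be recorded in the Parquet file metadata. */
  String name();

  /** Compress the whole input and return the compressed bytes. */
  byte[] compress(byte[] uncompressed) throws IOException;

  /**
   * Decompress into exactly {@code uncompressedLength} bytes; Parquet page headers
   * carry the uncompressed size, so the caller knows it up front.
   */
  byte[] decompress(byte[] compressed, int uncompressedLength) throws IOException;
}

/** Software implementation for illustration; an accelerator plugin would call its offload engine instead. */
class DeflatePluginCodec implements PluggableCompressionCodec {
  @Override
  public String name() {
    // A real plugin must emit output byte-compatible with the Parquet codec it replaces;
    // raw deflate is used here only to keep the example self-contained.
    return "EXAMPLE_DEFLATE";
  }

  @Override
  public byte[] compress(byte[] uncompressed) {
    Deflater deflater = new Deflater();
    deflater.setInput(uncompressed);
    deflater.finish();
    byte[] buf = new byte[uncompressed.length + 64]; // scratch buffer, grown if needed
    int n = 0;
    while (!deflater.finished()) {
      n += deflater.deflate(buf, n, buf.length - n);
      if (n == buf.length) {
        buf = Arrays.copyOf(buf, buf.length * 2); // incompressible data may expand
      }
    }
    deflater.end();
    return Arrays.copyOf(buf, n);
  }

  @Override
  public byte[] decompress(byte[] compressed, int uncompressedLength) throws IOException {
    Inflater inflater = new Inflater();
    inflater.setInput(compressed);
    byte[] out = new byte[uncompressedLength];
    try {
      int n = inflater.inflate(out);
      if (n != uncompressedLength) {
        throw new IOException("unexpected uncompressed size: " + n);
      }
    } catch (DataFormatException e) {
      throw new IOException(e);
    } finally {
      inflater.end();
    }
    return out;
  }
}

[Editor's note: an accelerator-backed implementation of such an interface is also where the copy-overhead concern raised above would surface, e.g. staging buffers in DMA-able memory before and after each offloaded call.]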
