Re: [External] Current state of parquet zstd OOM with hudi

2023-12-03 Thread Danny Chan
> I would say an entry in the hudi FAQ on this issue would be great, since it is
> hard to spot, and marked as fixed on the spark side.

Makes sense, you are welcome to submit a fix to the Hudi website.

Best, Danny

Nicolas Paris wrote on Wed, Nov 22, 2023 at 15:55:
> We fixed the hudi memory leak by patching parquet 1.12 and

Re: [External] Current state of parquet zstd OOM with hudi

2023-11-21 Thread Nicolas Paris
We fixed the hudi memory leak by patching parquet 1.12 and relying on gradle to override the transitive parquet dependencies with that version. I would say an entry in the hudi FAQ on this issue would be great, since it is hard to spot and is marked as fixed on the spark side. Also we didn't
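For readers wondering what such an override can look like, here is a minimal sketch in Gradle's Kotlin DSL. It is not the build file from this thread: the version string `1.12.3-patched` is a hypothetical placeholder for an internally built artifact, since the message only says that "parquet 1.12" was patched.

```kotlin
// build.gradle.kts (sketch, not the poster's actual build file)

// ASSUMPTION: hypothetical version of an internally published, patched parquet build.
val patchedParquetVersion = "1.12.3-patched"

configurations.all {
    resolutionStrategy.eachDependency {
        // Pin the parquet modules pulled in transitively (for example via Spark or Hudi).
        // parquet-format is skipped because it follows its own version numbering.
        if (requested.group == "org.apache.parquet" && requested.name != "parquet-format") {
            useVersion(patchedParquetVersion)
            because("use the patched parquet build that fixes the zstd memory leak")
        }
    }
}
```

Note that a rule like this only affects what Gradle resolves for the application's own classpath; parquet jars already shipped with the Spark installation on the cluster are not changed by it.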

Re: [External] Current state of parquet zstd OOM with hudi

2023-11-20 Thread Nicolas Paris
Following up on this, only spark 3.5.x ships with the fixed parquet version 1.13.x. It's available for the latest hudi 0.14 only. If I replace parquet in a previous version of spark, I will likely break the readers/writers, since methods have changed in parquet. Right now I will experiment with 3.5 and

Re: [External] Current state of parquet zstd OOM with hudi

2023-11-20 Thread Nicolas Paris
> version and check if it is fixed without changing anything in hudi
> From: "nicolas paris"
> Date: Mon, Nov 20, 2023, 20:07
> Subject: [External] Current state of parquet zstd OOM with hudi
> To: "Hudi Dev List"
> hey, a month ago someone spotted a memory leak while reading

Re: [External] Current state of parquet zstd OOM with hudi

2023-11-20 Thread 管梓越
on the parquet interface used by hudi. You can simply upgrade spark to the latest version and check if it is fixed without changing anything in hudi

From: "nicolas paris"
Date: Mon, Nov 20, 2023, 20:07
Subject: [External] Current state of parquet zstd OOM with hudi
To: "Hudi Dev List"

hey, a month