Thank you for sharing; those were quite interesting.

On Fri, Mar 11, 2022 at 10:43 AM Xinli shang <[email protected]> wrote:
> Hi all,
>
> The Uber EngBlog site just published two articles about Apache Parquet:
> Cost Efficiency @ Scale in Big Data File Format
> <https://eng.uber.com/cost-efficiency-big-data/> and One Stone, Three
> Birds: Finer-Grained Encryption @ Apache Parquet™
> <https://eng.uber.com/one-stone-three-birds-finer-grained-encryption-apache-parquet/>.
> Please check them out!
>
> The first one is about how to use Parquet ZSTD, the Column Pruning (deletion)
> tool, Precision Reduction, Multi-Column Ordering, and the fast translation
> tool in Parquet to reduce storage space and improve cost efficiency. This
> project alone saves storage at the hundred-PB level, which is equivalent to
> several million dollars in savings per year.
>
> The second one talks about using Apache Parquet's fine-grained encryption
> feature to solve three challenges: encryption, access control, and data
> retention! This wraps up the work we have done with the community over the
> last 3 years around Parquet Modular Encryption. I would like to thank Gidon
> for his continuous collaboration with us!
>
> If you have any questions about the blog posts, feel free to reach out!
>
> Xinli Shang
>
> Tech Lead Manager at Uber Data Infra
>
> VP Apache Parquet PMC Chair
>

--
Aaron Niskode-Dossett, Data Engineering -- Etsy
