Surely we can work together once we get some feedback on the RFC, Meng!

On Thu, Aug 11, 2022 at 9:32 AM 1037817390 <mengtao0...@qq.com.invalid> wrote:
> +1 for this.
> It would be better to provide some filter converters to facilitate
> integration with the engines,
> e.g. a converter from the Presto domain to the Hudi domain.
>
> I have already finished the first version of data skipping / partition
> pruning / filter pushdown for Presto:
> https://github.com/xiarixiaoyao/presto/commit/800646608d4b88799de0addcddd97d03592954ce
> Maybe we can work together.
>
> 孟涛
> mengtao0...@qq.com
>
> ------------------ Original Message ------------------
> From: "dev" <vin...@apache.org>;
> Sent: Thursday, August 11, 2022, 12:11 PM
> To: "dev" <dev@hudi.apache.org>;
> Subject: Re: [DISCUSS]: Integrate column stats index with all query engines
>
> +1 for this.
>
> Suggested new reviewers on the RFC.
> https://github.com/apache/hudi/pull/6345/files#r943073339
>
> On Wed, Aug 10, 2022 at 9:56 PM Pratyaksh Sharma <pratyaks...@gmail.com>
> wrote:
>
> > Hello community,
> >
> > With the introduction of the multi-modal index in Hudi, there is a lot
> > of scope for improvement on the querying side. There are two major ways
> > of reducing the data scanned at query time: partition pruning and file
> > pruning. With the latest developments in the community, partition
> > pruning is supported for commonly used query engines such as Spark,
> > Presto and Hive, but file pruning using the column stats index is
> > supported only for Spark and Flink.
> >
> > We intend to support data skipping for the rest of the engines as well,
> > which include Hive, Presto and Trino. I have written a draft RFC here:
> > https://github.com/apache/hudi/pull/6345.
> >
> > Please take a look and let me know what you think. Once we have some
> > feedback from the community, we can decide on the next steps.
> >