Hi,

I recently tried prestodb, liked it, and then came across the hudi project. After going through a few pages and the source code, it seems that hudi has its own InputFormat.
I am attempting to do an early predicate pushdown on my own side. The question is probably not specific to hudi, but I was hoping someone with experience using both prestodb and hive could enlighten me.

Imagine I create a hive table using my own customInputFormat. I see that Hudi has contributed an annotation which allows prestodb to take its splits from the customInputFormat. For simplicity, the hive table consists of two columns: someid and anotherid. The files in hdfs are laid out as /some/folder/someid.anotherid.someformat, and the query is something like:

select * from hive_table where anotherid = abc

What I want to do is capture the predicate from the above query, so that when prestodb looks up the Hive metadata for the table and gets back my customInputFormat, I could use a glob expression in the getSplits method to pick up only those files which satisfy the condition anotherid = abc, before the handoff to query execution in presto.

Any pointers would be useful.

Thanks,
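
P.S. To make the idea concrete, here is a rough sketch of just the file-filtering part, in plain Java with no Hadoop or presto dependencies. The class and method names (GlobFilterSketch, globFor, filterByAnotherId) are made up for illustration. It assumes the predicate value has already reached my code somehow, which is exactly the part I do not know how to do, and that someid and anotherid contain no dots or glob metacharacters. In the real getSplits I would presumably hand the same pattern to Hadoop's FileSystem.globStatus instead of matching names myself.

```java
import java.nio.file.FileSystems;
import java.nio.file.PathMatcher;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper: not part of hudi, hive, or presto.
class GlobFilterSketch {

    // Files are named <someid>.<anotherid>.<someformat>, so a fixed anotherid
    // becomes the middle component of the glob. The value is not escaped here.
    static String globFor(String anotherId) {
        return "*." + anotherId + ".*";
    }

    // Keeps only the file names (not full paths) whose middle component
    // equals the requested anotherid.
    static List<String> filterByAnotherId(List<String> fileNames, String anotherId) {
        PathMatcher matcher =
                FileSystems.getDefault().getPathMatcher("glob:" + globFor(anotherId));
        List<String> kept = new ArrayList<>();
        for (String name : fileNames) {
            if (matcher.matches(Paths.get(name))) {
                kept.add(name);
            }
        }
        return kept;
    }
}
```

So for anotherid = abc the pattern would be *.abc.* under /some/folder, and only the matching files would be turned into splits.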
