Re: Data-source/partition pruning with remotely stored non-parquet files

2019-02-20 Thread salim achouche
I'll take a stab at answering your questions. Drill uses multiple strategies to reduce the number of files it reads:
Implicit-column-based pruning
  o Drill exposes a set of implicit columns for file-based readers (e.g., filename, path)
  o You could use such columns to prune

Data-source/partition pruning with remotely stored non-parquet files

2019-02-19 Thread Lokendra Singh Panwar
Hi All, I am writing a custom storage plugin to read and query non-static JSON files stored on remote services, and I wanted to use something similar to Drill's partition pruning to optimise my queries. The files are looked up dynamically within the plugin via an external service based on the