Drill does not build indexes since it operates on top of underlying storage systems. It also does not have a data cache - any type of data caching in a distributed engine would require maintaining cache consistency with the source data as well as coherency across nodes. If you are running on top of a distributed file system, note that the file system itself caches hot data blocks.
However, it sounds to me like your use case would benefit from a partitioned data layout. Drill has partitioning support as part of CTAS (CREATE TABLE AS) and partition pruning during queries. It sounds like your data is amenable to partitioning by week. If you partition it that way, your queries could leverage partition pruning (see https://drill.apache.org/docs/partition-pruning/ and https://community.mapr.com/thread/10254) and return results faster.

-Aman

On Thu, Jun 9, 2016 at 2:15 AM, Kiril Menshikov <[email protected]> wrote:

> Hello,
>
> I have custom BI tool and want to have cache for queries. Most of the
> queries are time periods, so I need to store last week data.
>
> I know Drill is big distributed system, but want leverage it to my needs or
> create small arrow cache. How Drill make indexes? Can you point me to the
> source code.
>
> Thanks,
> -Kirils
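
P.S. To make the partitioning suggestion above a bit more concrete, here is a rough sketch of how a BI tool might generate the two statements involved: a CTAS with PARTITION BY, and a "last week" query that filters on the partition column so pruning has a chance to apply. Everything named here is an assumption for illustration only: the table paths, the `week_start`, `event_time` and `metric` columns, and the choice of Monday as the first day of the week. It also assumes the CTAS output format is Parquet, and that the source data already carries a `week_start` date column.

```python
from datetime import date, timedelta


def week_start(d: date) -> date:
    """Return the Monday of the week containing d (Monday start is an assumption)."""
    return d - timedelta(days=d.weekday())


def ctas_partitioned_by_week(target: str, source: str) -> str:
    """Build a Drill CTAS statement that partitions the output by week.

    The partition column must appear in the SELECT list. The column names
    (week_start, event_time, metric) are hypothetical.
    """
    return (
        f"CREATE TABLE {target} PARTITION BY (week_start) AS "
        f"SELECT week_start, event_time, metric FROM {source}"
    )


def last_week_query(table: str, today: date) -> str:
    """Build a query for the previous week, filtering on the partition column.

    Filtering directly on week_start with a literal is what gives the
    planner the opportunity to skip files for other weeks.
    """
    start = week_start(today) - timedelta(days=7)
    return (
        f"SELECT event_time, metric FROM {table} "
        f"WHERE week_start = DATE '{start.isoformat()}'"
    )


if __name__ == "__main__":
    # Hypothetical workspace and paths; adjust to your storage plugin setup.
    print(ctas_partitioned_by_week("dfs.tmp.`events_by_week`", "dfs.`/data/events`"))
    print(last_week_query("dfs.tmp.`events_by_week`", date.today()))
```

Whether pruning actually kicks in is something to confirm with EXPLAIN PLAN on your own data; the sketch only shows the shape of the statements.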
