Hey Jacques, Steven,

Do we have a branch somewhere that has the initial prototype code? I'd
like to prune the file a bit, since it looks like reducing the size of the
metadata cache file might yield the best results.

Also, did we have a particular reason for going with JSON as opposed to a
more compact binary format? Are there any arguments against saving this as
a protobuf/BSON/Parquet file?
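If we do experiment with a binary format, Jackson's Smile backend might be
the lowest-friction option since we already depend on Jackson: same data
model, binary encoding underneath. A rough, untested sketch, where
ParquetFileMetadata is a made-up stand-in for the real cache POJO:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.smile.SmileFactory;

    public class SmileCacheSketch {
      // Made-up stand-in for the real metadata cache POJO.
      public static class ParquetFileMetadata {
        public String path;
        public long rowCount;
      }

      public static void main(String[] args) throws Exception {
        ObjectMapper jsonMapper = new ObjectMapper();
        // Only the factory changes; POJOs and annotations stay the same.
        ObjectMapper smileMapper = new ObjectMapper(new SmileFactory());

        ParquetFileMetadata meta = new ParquetFileMetadata();
        meta.path = "/tpch/lineitem/0_0_0.parquet";
        meta.rowCount = 6001215L;

        byte[] asJson = jsonMapper.writeValueAsBytes(meta);
        byte[] asSmile = smileMapper.writeValueAsBytes(meta);
        System.out.printf("json=%d bytes, smile=%d bytes%n",
            asJson.length, asSmile.length);

        // Round-trips the same way JSON does.
        ParquetFileMetadata back =
            smileMapper.readValue(asSmile, ParquetFileMetadata.class);
      }
    }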
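Separately, on the AfterBurner module Aman mentions below: if the profiling
points at reflection overhead, trying it should just be a one-line module
registration. Sketch, assuming jackson-module-afterburner is on the
classpath:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

    public class AfterburnerSketch {
      public static ObjectMapper newCacheMapper() {
        ObjectMapper mapper = new ObjectMapper();
        // Swaps reflection-based property access for generated bytecode;
        // no other mapper configuration needs to change.
        mapper.registerModule(new AfterburnerModule());
        return mapper;
      }
    }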
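And on the single-threaded read Aman raises below: if we partitioned the
cache over several files as Hanifi suggests, fanning the reads out over a
fixed pool would be straightforward. A sketch under those assumptions,
where readOneCacheFile is a hypothetical per-file deserializer:

    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelCacheReadSketch {
      // Hypothetical reader for one cache partition; stands in for the
      // real deserialization logic.
      static List<String> readOneCacheFile(Path p) {
        return Collections.singletonList(p.toString());
      }

      static List<String> readAll(List<Path> cacheFiles) throws Exception {
        // Mirror the 16-way default used for direct parquet reads.
        ExecutorService pool = Executors.newFixedThreadPool(16);
        try {
          List<Future<List<String>>> futures = new ArrayList<>();
          for (final Path p : cacheFiles) {
            futures.add(pool.submit(() -> readOneCacheFile(p)));
          }
          List<String> all = new ArrayList<>();
          for (Future<List<String>> f : futures) {
            all.addAll(f.get()); // keeps file order; surfaces read errors
          }
          return all;
        } finally {
          pool.shutdown();
        }
      }
    }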
Parth

On Mon, Oct 26, 2015 at 2:42 PM, Jacques Nadeau <[email protected]> wrote:

> My first thought is that we've gotten too generous in what we're storing
> in the Parquet metadata file. Early implementations were very lean, and
> it seems far larger today. For example, early implementations didn't
> keep statistics and ignored row groups (files, schema and block
> locations only). If we need multiple levels of information, we may want
> to stagger (or normalize) them in the file. Also, we may want to think
> about what is the minimum that must be done in planning. We could do the
> file pruning at execution time rather than single-tracking these things
> (though that makes stats harder).
>
> I also think we should be cautious about jumping to a conclusion until
> DRILL-3973 provides more insight.
>
> In terms of caching, I'd be more inclined to rely on file system caching
> and make sure serialization/deserialization is as efficient as possible,
> as opposed to implementing an application-level cache. (We already have
> enough problems managing memory without having to figure out when we
> should drop a metadata cache :D)
>
> As an aside, I've always liked this post for entertainment and for its
> thoughts on virtual memory:
> https://www.varnish-cache.org/trac/wiki/ArchitectNotes
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Oct 26, 2015 at 2:25 PM, Hanifi Gunes <[email protected]> wrote:
>
> > One more thing: for workloads running queries over subsets of the same
> > parquet files, we could consider maintaining an in-memory cache as
> > well, assuming the metadata memory footprint per file is low. Since
> > parquet files are static, we would not need to invalidate the cache
> > often.
> >
> > H+
> >
> > On Mon, Oct 26, 2015 at 2:10 PM, Hanifi Gunes <[email protected]> wrote:
> >
> > > I am not familiar with the contents of the metadata we store, but if
> > > the deserialization workload fits any of afterburner's claimed
> > > improvement points [1], it could well be worth trying, given that
> > > the claimed throughput gain is substantial.
> > >
> > > It could also be a good idea to partition the cache over a number of
> > > files for better parallelization, given that the number of cache
> > > files generated is *significantly* less than the number of parquet
> > > files. Maintaining global statistics seems like an improvement point
> > > too.
> > >
> > > -H+
> > >
> > > 1: https://github.com/FasterXML/jackson-module-afterburner#what-is-optimized
> > >
> > > On Sun, Oct 25, 2015 at 9:33 AM, Aman Sinha <[email protected]> wrote:
> > >
> > >> Forgot to include the link for Jackson's AfterBurner module:
> > >> https://github.com/FasterXML/jackson-module-afterburner
> > >>
> > >> On Sun, Oct 25, 2015 at 9:28 AM, Aman Sinha <[email protected]> wrote:
> > >>
> > >> > I was going to file an enhancement JIRA but thought I would
> > >> > discuss it here first:
> > >> >
> > >> > The parquet metadata cache file is a JSON file that contains a
> > >> > subset of the metadata extracted from the parquet files. The
> > >> > cache file can get really large: a few GBs for a few hundred
> > >> > thousand files. I have filed a separate JIRA, DRILL-3973, for
> > >> > profiling the various aspects of planning, including metadata
> > >> > operations.
> > >> >
> > >> > In the meantime, the timestamps in the drillbit.log output
> > >> > indicate a large chunk of time spent creating the drill table to
> > >> > begin with, which points to a bottleneck in reading the metadata.
> > >> > (I can provide performance numbers later, once we confirm through
> > >> > profiling.)
> > >> >
> > >> > A few thoughts on improvements:
> > >> > - The jackson deserialization of the JSON file is very slow. Can
> > >> > this be sped up? For instance, the AfterBurner module of jackson
> > >> > claims to improve performance by 30-40% by avoiding the use of
> > >> > reflection.
> > >> > - The cache file read is a single-threaded process. If we were
> > >> > reading directly from parquet files, we would use a default of 16
> > >> > threads. What can be done to parallelize the read?
> > >> > - Is there any operation that could be done once, during the
> > >> > REFRESH METADATA command? For instance, examining the min/max
> > >> > values to determine whether a partition column has a single value
> > >> > could be eliminated if we did this computation during the REFRESH
> > >> > METADATA command and stored the summary one time.
> > >> > - A pertinent question: should the cache file be stored in a more
> > >> > efficient format, such as Parquet, instead of JSON?
> > >> >
> > >> > Aman
