Hey Jacques, Steven,

Do we have a branch somewhere that has the initial prototype code? I'd
like to prune the file a bit, since it looks like reducing the size of the
metadata cache file might yield the best results.

Also, did we have a particular reason for going with JSON as opposed to a
more compact binary format? Are there any arguments against saving this as
a protobuf/BSON/Parquet file?
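If we do experiment with a binary format, Jackson's Smile backend might be
the lowest-friction option since we already depend on Jackson: same data
model, binary encoding underneath. A rough, untested sketch, where
ParquetFileMetadata is a made-up stand-in for the real cache POJO:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.dataformat.smile.SmileFactory;

    public class SmileCacheSketch {
      // Made-up stand-in for the real metadata cache POJO.
      public static class ParquetFileMetadata {
        public String path;
        public long rowCount;
      }

      public static void main(String[] args) throws Exception {
        ObjectMapper jsonMapper = new ObjectMapper();
        // Only the factory changes; POJOs and annotations stay the same.
        ObjectMapper smileMapper = new ObjectMapper(new SmileFactory());

        ParquetFileMetadata meta = new ParquetFileMetadata();
        meta.path = "/tpch/lineitem/0_0_0.parquet";
        meta.rowCount = 6001215L;

        byte[] asJson = jsonMapper.writeValueAsBytes(meta);
        byte[] asSmile = smileMapper.writeValueAsBytes(meta);
        System.out.printf("json=%d bytes, smile=%d bytes%n",
            asJson.length, asSmile.length);

        // Round-trips the same way JSON does.
        ParquetFileMetadata back =
            smileMapper.readValue(asSmile, ParquetFileMetadata.class);
      }
    }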
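Separately, on the AfterBurner module Aman mentions below: if the profiling
points at reflection overhead, trying it should just be a one-line module
registration. Sketch, assuming jackson-module-afterburner is on the
classpath:

    import com.fasterxml.jackson.databind.ObjectMapper;
    import com.fasterxml.jackson.module.afterburner.AfterburnerModule;

    public class AfterburnerSketch {
      public static ObjectMapper newCacheMapper() {
        ObjectMapper mapper = new ObjectMapper();
        // Swaps reflection-based property access for generated bytecode;
        // no other mapper configuration needs to change.
        mapper.registerModule(new AfterburnerModule());
        return mapper;
      }
    }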
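And on the single-threaded read Aman raises below: if we partitioned the
cache over several files as Hanifi suggests, fanning the reads out over a
fixed pool would be straightforward. A sketch under those assumptions,
where readOneCacheFile is a hypothetical per-file deserializer:

    import java.nio.file.Path;
    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    public class ParallelCacheReadSketch {
      // Hypothetical reader for one cache partition; stands in for the
      // real deserialization logic.
      static List<String> readOneCacheFile(Path p) {
        return Collections.singletonList(p.toString());
      }

      static List<String> readAll(List<Path> cacheFiles) throws Exception {
        // Mirror the 16-way default used for direct parquet reads.
        ExecutorService pool = Executors.newFixedThreadPool(16);
        try {
          List<Future<List<String>>> futures = new ArrayList<>();
          for (final Path p : cacheFiles) {
            futures.add(pool.submit(() -> readOneCacheFile(p)));
          }
          List<String> all = new ArrayList<>();
          for (Future<List<String>> f : futures) {
            all.addAll(f.get()); // keeps file order; surfaces read errors
          }
          return all;
        } finally {
          pool.shutdown();
        }
      }
    }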
Parth

On Mon, Oct 26, 2015 at 2:42 PM, Jacques Nadeau <[email protected]> wrote:

> My first thought is that we've gotten too generous in what we're storing
> in the Parquet metadata file. Early implementations were very lean, and
> it seems far larger today. For example, early implementations didn't
> keep statistics and ignored row groups (files, schema and block
> locations only). If we need multiple levels of information, we may want
> to stagger (or normalize) them in the file. Also, we may want to think
> about what is the minimum that must be done in planning. We could do the
> file pruning at execution time rather than single-tracking these things
> (though that makes stats harder).
>
> I also think we should be cautious about jumping to a conclusion until
> DRILL-3973 provides more insight.
>
> In terms of caching, I'd be more inclined to rely on file system caching
> and make sure serialization/deserialization is as efficient as possible,
> as opposed to implementing an application-level cache. (We already have
> enough problems managing memory without having to figure out when we
> should drop a metadata cache :D)
>
> As an aside, I've always liked this post for entertainment and for its
> thoughts on virtual memory:
> https://www.varnish-cache.org/trac/wiki/ArchitectNotes
>
> --
> Jacques Nadeau
> CTO and Co-Founder, Dremio
>
> On Mon, Oct 26, 2015 at 2:25 PM, Hanifi Gunes <[email protected]> wrote:
>
> > One more thing: for workloads running queries over subsets of the same
> > parquet files, we could consider maintaining an in-memory cache as
> > well, assuming the metadata memory footprint per file is low. Since
> > parquet files are static, we would not need to invalidate the cache
> > often.
> >
> > H+
> >
> > On Mon, Oct 26, 2015 at 2:10 PM, Hanifi Gunes <[email protected]> wrote:
> >
> > > I am not familiar with the contents of the metadata we store, but if
> > > the deserialization workload fits any of afterburner's claimed
> > > improvement points [1], it could well be worth trying, given that
> > > the claimed throughput gain is substantial.
> > >
> > > It could also be a good idea to partition the cache over a number of
> > > files for better parallelization, given that the number of cache
> > > files generated is *significantly* less than the number of parquet
> > > files. Maintaining global statistics seems like an improvement point
> > > too.
> > >
> > > -H+
> > >
> > > 1: https://github.com/FasterXML/jackson-module-afterburner#what-is-optimized
> > >
> > > On Sun, Oct 25, 2015 at 9:33 AM, Aman Sinha <[email protected]> wrote:
> > >
> > >> Forgot to include the link for Jackson's AfterBurner module:
> > >> https://github.com/FasterXML/jackson-module-afterburner
> > >>
> > >> On Sun, Oct 25, 2015 at 9:28 AM, Aman Sinha <[email protected]> wrote:
> > >>
> > >> > I was going to file an enhancement JIRA but thought I would
> > >> > discuss it here first:
> > >> >
> > >> > The parquet metadata cache file is a JSON file that contains a
> > >> > subset of the metadata extracted from the parquet files. The
> > >> > cache file can get really large: a few GBs for a few hundred
> > >> > thousand files. I have filed a separate JIRA, DRILL-3973, for
> > >> > profiling the various aspects of planning, including metadata
> > >> > operations.
> > >> >
> > >> > In the meantime, the timestamps in the drillbit.log output
> > >> > indicate a large chunk of time spent creating the drill table to
> > >> > begin with, which points to a bottleneck in reading the metadata.
> > >> > (I can provide performance numbers later, once we confirm through
> > >> > profiling.)
> > >> >
> > >> > A few thoughts on improvements:
> > >> > - The jackson deserialization of the JSON file is very slow. Can
> > >> > this be sped up? For instance, the AfterBurner module of jackson
> > >> > claims to improve performance by 30-40% by avoiding the use of
> > >> > reflection.
> > >> > - The cache file read is a single-threaded process. If we were
> > >> > reading directly from parquet files, we would use a default of 16
> > >> > threads. What can be done to parallelize the read?
> > >> > - Is there any operation that could be done once, during the
> > >> > REFRESH METADATA command? For instance, examining the min/max
> > >> > values to determine whether a partition column has a single value
> > >> > could be eliminated if we did this computation during the REFRESH
> > >> > METADATA command and stored the summary one time.
> > >> > - A pertinent question: should the cache file be stored in a more
> > >> > efficient format, such as Parquet, instead of JSON?
> > >> >
> > >> > Aman
