As an aside, this is actually pretty relevant to the work I've been doing
for Presto/Accumulo integration.  It isn't uncommon to have around a
million exact Ranges (that is, Ranges with a single row ID) spread across
the five Presto worker nodes we use for scanning Accumulo.  Right now,
these ranges get packed into PrestoSplits, 10k ranges per split (an
arbitrary number I chose), and each split is run in parallel (depending on
the overall number of splits, they may be queued for execution).
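
For the curious, the packing step is nothing fancy; a rough sketch is below
(the class and method names are made up for illustration, since the real
connector types carry a lot more plumbing):

    import java.util.ArrayList;
    import java.util.List;

    import org.apache.accumulo.core.data.Range;

    public class RangePacking {

        // 10k is the arbitrary cap mentioned above
        static final int RANGES_PER_SPLIT = 10_000;

        // Pack a large list of exact Ranges into fixed-size chunks, one chunk
        // per split. Each chunk would get wrapped in whatever split object the
        // connector hands to a Presto worker.
        static List<List<Range>> packIntoSplits(List<Range> exactRanges) {
            List<List<Range>> splits = new ArrayList<>();
            for (int i = 0; i < exactRanges.size(); i += RANGES_PER_SPLIT) {
                int end = Math.min(i + RANGES_PER_SPLIT, exactRanges.size());
                splits.add(new ArrayList<>(exactRanges.subList(i, end)));
            }
            return splits;
        }
    }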

I'm curious to see the query impact of switching from the current
BatchScanner implementation to a fixed thread pool of Scanners.  Maybe I'll
play around with it sometime soon.
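
Roughly what I have in mind, as a minimal, untested sketch (it assumes an
already-constructed Connector and leaves out error handling and all of the
Presto plumbing):

    import java.util.ArrayList;
    import java.util.List;
    import java.util.Map.Entry;
    import java.util.concurrent.Callable;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.Future;

    import org.apache.accumulo.core.client.Connector;
    import org.apache.accumulo.core.client.Scanner;
    import org.apache.accumulo.core.data.Key;
    import org.apache.accumulo.core.data.Range;
    import org.apache.accumulo.core.data.Value;
    import org.apache.accumulo.core.security.Authorizations;

    public class ScannerPoolSketch {

        // Scan each exact Range with its own Scanner, using a fixed pool of
        // worker threads. Returns the total entry count so there is something
        // to measure.
        static long scanWithPool(final Connector conn, final String table,
                List<Range> exactRanges, int numThreads) throws Exception {
            ExecutorService pool = Executors.newFixedThreadPool(numThreads);
            List<Future<Long>> results = new ArrayList<>();
            for (final Range range : exactRanges) {
                results.add(pool.submit(new Callable<Long>() {
                    @Override
                    public Long call() throws Exception {
                        Scanner scanner = conn.createScanner(table, Authorizations.EMPTY);
                        scanner.setRange(range);
                        long count = 0;
                        for (Entry<Key,Value> entry : scanner) {
                            count++; // a real task would feed entries to the page source
                        }
                        return count;
                    }
                }));
            }
            long total = 0;
            for (Future<Long> result : results) {
                total += result.get(); // surfaces any per-range failure
            }
            pool.shutdown();
            return total;
        }
    }

One task per range is obviously naive at a million ranges; grouping a handful
of ranges per task would amortize the Scanner setup cost.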

--Adam

On Mon, Sep 12, 2016 at 2:47 PM, Dan Blum <db...@bbn.com> wrote:

> I think the 450 ranges returned a total of about 7.5M entries, but the
> ranges were in fact quite small relative to the size of the table.
>
> -----Original Message-----
> From: Josh Elser [mailto:josh.el...@gmail.com]
> Sent: Monday, September 12, 2016 2:43 PM
> To: user@accumulo.apache.org
> Subject: Re: Accumulo Seek performance
>
> What does a "large scan" mean here, Dan?
>
> Sven's original problem statement was running many small/pointed Ranges
> (e.g. point lookups). My observation was that BatchScanners were slower
> than running each in a Scanner when using multiple BS's concurrently.
>
> Dan Blum wrote:
> > I tested a large scan on a 1.6.2 cluster with 11 tablet servers - using
> Scanners was much slower than using a BatchScanner with 11 threads, by
> about a 5:1 ratio. There were 450 ranges.
> >
> > -----Original Message-----
> > From: Josh Elser [mailto:josh.el...@gmail.com]
> > Sent: Monday, September 12, 2016 1:42 PM
> > To: user@accumulo.apache.org
> > Subject: Re: Accumulo Seek performance
> >
> > I had increased the readahead thread pool to 32 (from 16). I had also
> > increased the minimum thread pool size from 20 to 40. I had 10 tablets
> > with the data block cache turned on (probably only 256M tho).
> >
> > Each tablet had a single file (manually compacted). Did not observe
> > cache rates.
> >
> > I've been working through this with Keith on IRC this morning too. Found
> > that a single batchscanner (one partition) is faster than the Scanner.
> > Two partitions and things started to slow down.
> >
> > Two interesting points to still pursue, IMO:
> >
> > 1. I saw that the tserver-side logging for MultiScanSess was near
> > identical to the BatchScanner timings
> > 2. The minimum server threads did not seem to be taking effect. Despite
> > having the value set to 64, I only saw a few ClientPool threads in a
> > jstack after running the test.
> >
> > Adam Fuchs wrote:
> >> Sorry, Monday morning poor reading skills, I guess. :)
> >>
> >> So, 3000 ranges in 40 seconds with the BatchScanner. In my past
> >> experience HDFS seeks tend to take something like 10-100ms, and I would
> >> expect that time to dominate here. With 60 client threads your
> >> bottleneck should be the readahead pool, which I believe defaults to 16
> >> threads. If you get perfect index caching then you should be seeing
> >> something like 3000/16*50ms = 9,375ms. That's in the right ballpark, but
> >> it assumes no data cache hits. Do you have any idea of how many files
> >> you had per tablet after the ingest? Do you know what your cache hit
> >> rate was?
> >>
> >> Adam
> >>
> >>
> >> On Mon, Sep 12, 2016 at 9:14 AM, Josh Elser <josh.el...@gmail.com> wrote:
> >>
> >>      5 iterations, figured that would be apparent from the log messages :)
> >>
> >>      The code is already posted in my original message.
> >>
> >>      Adam Fuchs wrote:
> >>
> >>          Josh,
> >>
> >>          Two questions:
> >>
> >>          1. How many iterations did you do? I would like to see an absolute
> >>          number of lookups per second to compare against other observations.
> >>
> >>          2. Can you post your code somewhere so I can run it?
> >>
> >>          Thanks,
> >>          Adam
> >>
> >>
> >>          On Sat, Sep 10, 2016 at 3:01 PM, Josh Elser
> >>          <josh.el...@gmail.com> wrote:
> >>
> >>               Sven, et al:
> >>
> >>               So, it would appear that I have been able to reproduce this one
> >>               (better late than never, I guess...). tl;dr Serially using Scanners
> >>               to do point lookups instead of a BatchScanner is ~20x faster. This
> >>               sounds like a pretty serious performance issue to me.
> >>
> >>               Here's a general outline for what I did.
> >>
> >>               * Accumulo 1.8.0
> >>               * Created a table with 1M rows, each row with 10 columns using YCSB (workloada)
> >>               * Split the table into 9 tablets
> >>               * Computed the set of all rows in the table
> >>
> >>               For a number of iterations:
> >>               * Shuffle this set of rows
> >>               * Choose the first N rows
> >>               * Construct an equivalent set of Ranges from the set of Rows,
> >>               choosing a random column (0-9)
> >>               * Partition the N rows into X collections
> >>               * Submit X tasks to query one partition of the N rows (to a
> >>               thread pool with X fixed threads)
> >>
> >>               I have two implementations of these tasks. One, where all ranges in
> >>               a partition are executed via one BatchScanner. A second where each
> >>               range is executed in serial using a Scanner. The numbers speak for
> >>               themselves.
> >>
> >>               ** BatchScanners **
> >>               2016-09-10 17:51:38,811 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
> >>               2016-09-10 17:51:38,843 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
> >>               2016-09-10 17:51:38,846 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Queries executed in 40178 ms
> >>               2016-09-10 17:52:19,025 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Queries executed in 42296 ms
> >>               2016-09-10 17:53:01,321 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:53:47,414 [joshelser.YcsbBatchScanner] INFO : Queries executed in 46094 ms
> >>               2016-09-10 17:53:47,415 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:54:35,118 [joshelser.YcsbBatchScanner] INFO : Queries executed in 47704 ms
> >>               2016-09-10 17:54:35,119 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:55:24,339 [joshelser.YcsbBatchScanner] INFO : Queries executed in 49221 ms
> >>
> >>               ** Scanners **
> >>               2016-09-10 17:57:23,867 [joshelser.YcsbBatchScanner] INFO : Shuffled all rows
> >>               2016-09-10 17:57:23,898 [joshelser.YcsbBatchScanner] INFO : All ranges calculated: 3000 ranges found
> >>               2016-09-10 17:57:23,903 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2833 ms
> >>               2016-09-10 17:57:26,738 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2536 ms
> >>               2016-09-10 17:57:29,275 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2150 ms
> >>               2016-09-10 17:57:31,425 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2061 ms
> >>               2016-09-10 17:57:33,487 [joshelser.YcsbBatchScanner] INFO : Executing 6 range partitions using a pool of 6 threads
> >>               2016-09-10 17:57:35,628 [joshelser.YcsbBatchScanner] INFO : Queries executed in 2140 ms
> >>
> >>               Query code is available:
> >>               https://github.com/joshelser/accumulo-range-binning
> >>
> >>
> >>               Sven Hodapp wrote:
> >>
> >>                   Hi Keith,
> >>
> >>                   I've tried it with 1, 2 or 10 threads. Unfortunately there were
> >>                   no significant differences.
> >>                   Maybe it's a problem with the table structure? For example, it
> >>                   may happen that one row id (e.g. a sentence) has several
> >>                   thousand column families. Can this affect the seek performance?
> >>
> >>                   So my initial example has about 3000 row ids to seek,
> >>                   which will return about 500k entries. If I filter for specific
> >>                   column families (e.g. a document without annotations), it will
> >>                   return about 5k entries, but the seek time will only be halved.
> >>                   Are there too many column families to seek quickly?
> >>
> >>                   Thanks!
> >>
> >>                   Regards,
> >>                   Sven
> >>
> >>
> >>
> >
>
>
