I have re-attached the image below. In case it still doesn't show up, the
average point lookup throughput for *SSD for LSM + Logging* is only around
*3k-4k/s*. When a separate hard disk is used for logging, the average point
lookup throughput reaches *30k-40k/s*.

[image: image.png]
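
For anyone unfamiliar with what "forcing" the log means, here is a minimal
Python sketch of an append-only log that is flushed and fsync'ed after every
batch of records. It is only an illustration, not the actual log flusher
discussed in this thread; `append_log` and `records_per_force` are
hypothetical names chosen for this example.

```python
import os


def append_log(path, records, records_per_force):
    """Append records to a log file, forcing them to disk in batches.

    `records` is a list of bytes objects. Every `records_per_force`
    records the file is flushed and fsync'ed. Returns the number of
    forces performed. (Hypothetical helper, for illustration only.)
    """
    forces = 0
    with open(path, "ab") as log:
        for i, record in enumerate(records, start=1):
            log.write(record)
            if i % records_per_force == 0:
                log.flush()              # push Python's buffer to the OS
                os.fsync(log.fileno())   # ask the OS to persist to the device
                forces += 1
        if len(records) % records_per_force != 0:
            # Force the trailing partial batch as well.
            log.flush()
            os.fsync(log.fileno())
            forces += 1
    return forces
```

Each `os.fsync` call waits until the device reports the data as durable, so
the cost of a force is dominated by the device round trip rather than by the
number of bytes written.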

Best regards,
Chen Luo

On Thu, Feb 21, 2019 at 10:01 AM abdullah alamoudi <[email protected]>
wrote:

> Thanks for sharing Chen, very interesting.
>
> The image doesn't show up for me. Not sure if it shows up for others?
>
> Cheers,
> Abdullah.
>
> On Wed, Feb 20, 2019 at 1:29 PM Chen Luo <[email protected]> wrote:
>
> > Hi Devs,
> >
> > Recently I've been running experiments with concurrent ingestions and
> > queries on SSDs. I'd like to share an important lesson from my
> > experiments. In short, *it is very important (from a performance
> > perspective) to use a separate disk for logging, even though SSDs are
> > good at random I/Os*.
> >
> > The following experiment illustrates this point. I was using YCSB with
> > 100GB of base data (100M records of 1KB each). During each experiment,
> > there was a constant data arrival process of 3600 records/s. I executed
> > concurrent point lookups (uniformly distributed) as fast as possible
> > using 16 query threads (to saturate the disk). The page size was set to
> > 4KB. The experiments were performed on SSDs. The only difference is that
> > one experiment had a separate hard disk for logging, while the other used
> > the same SSD for both LSM and logging. The point lookup throughput over
> > time is plotted below. The negative impact of logging is huge!
> >
> > [image: image.png]
> >
> > The reason is that logging needs to frequently force disk writes (in this
> > experiment, the log flusher forces 70-80 times per second). Even though
> > the disk bandwidth used by the log flusher is small (4-5MB/s), the
> > frequent disk forces can seriously reduce the overall disk throughput. If
> > you have a workload with concurrent data ingestion and queries, please DO
> > consider using a separate disk for logging to fully utilize the SSD
> > bandwidth.
> >
> > Best regards,
> > Chen Luo
> >
>
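
A quick back-of-the-envelope check of the figures in the quoted message,
offered as a sketch rather than a measurement. It assumes "1KB" means 1024
bytes and "MB" means 10^6 bytes; the function names are made up for this
example.

```python
# Figures taken from the experiment description in the quoted message.
RECORDS_PER_SEC = 3600      # constant data arrival rate
RECORD_SIZE_BYTES = 1024    # "1KB" per record (assumed to be 1024 bytes)
FORCES_PER_SEC = (70, 80)   # reported log flusher force rate
LOG_MB_PER_SEC = (4, 5)     # reported log flusher bandwidth


def payload_mb_per_sec(records_per_sec, record_size_bytes):
    """Raw record payload arriving per second, in MB (10^6 bytes)."""
    return records_per_sec * record_size_bytes / 1e6


def records_per_force(records_per_sec, forces_per_sec):
    """Average number of ingested records covered by one disk force."""
    return records_per_sec / forces_per_sec


def kb_per_force(log_mb_per_sec, forces_per_sec):
    """Average log volume written per disk force, in KB (10^3 bytes)."""
    return log_mb_per_sec * 1000 / forces_per_sec


if __name__ == "__main__":
    print("payload MB/s:", payload_mb_per_sec(RECORDS_PER_SEC, RECORD_SIZE_BYTES))
    for forces in FORCES_PER_SEC:
        print(forces, "forces/s ->",
              records_per_force(RECORDS_PER_SEC, forces), "records per force")
    print("KB per force, low estimate:",
          kb_per_force(LOG_MB_PER_SEC[0], FORCES_PER_SEC[1]))
    print("KB per force, high estimate:",
          kb_per_force(LOG_MB_PER_SEC[1], FORCES_PER_SEC[0]))
```

The raw record payload comes to about 3.7MB/s, a little under the 4-5MB/s
reported for the log flusher; the gap may be per-record log overhead, but the
thread does not say. At 70-80 forces per second, each force covers roughly 45
to 51 records, so the writes are small and the cost lies in how often the disk
is forced.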
