Re: Unexpected LMDB RSS /performance difference on similar machines

Howard Chu Sun, 07 Jun 2020 13:45:20 -0700

Alec Matusis wrote:
> Hi again Howard,
> 
> Sorry for the confusion with two different machines, but I have a question 
> about just one machine. 
> 
> I observe two things on a single machine:
> 
> 1. My test binary reads 13GB into shared memory with MDB_NORDAHEAD and 83GB 
> without it. The cold read shows about 10M/s sustained read speed 
> on iotop and takes 18m.
> Then I do 
> dd if=/fusionio1/lmdb/db.0/dbgraph/data.mdb of=/dev/null bs=8k
> dd shows a read speed of 300M/s, i.e. 30x faster than looping over a 
> read-only cursor. Can anything (other than removing MDB_NORDAHEAD) be done 
> to reduce this 30x read speed difference on the first cold read?


Just run dd before starting your main program ...
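
If you would rather do the pre-read from a script than from dd, something
like the following should have much the same effect. This is only a sketch:
the function name warm_page_cache and the 1MB block size are my own choices
for illustration, and the sequential-access hint is applied only where the
platform offers it.

```python
import os

def warm_page_cache(path, block_size=1 << 20):
    """Read `path` front to back and discard the data, like dd to /dev/null.

    Returns the number of bytes read. The name and default block size are
    illustrative assumptions, not part of LMDB.
    """
    total = 0
    fd = os.open(path, os.O_RDONLY)
    try:
        # Hint that access will be sequential, where the platform supports it.
        if hasattr(os, "posix_fadvise"):
            try:
                os.posix_fadvise(fd, 0, 0, os.POSIX_FADV_SEQUENTIAL)
            except OSError:
                pass  # The hint is optional; carry on without it.
        while True:
            chunk = os.read(fd, block_size)
            if not chunk:
                break
            total += len(chunk)
    finally:
        os.close(fd)
    return total
```

Call it on the data.mdb file before opening the environment. Whether the
pages stay cached afterwards is up to the kernel and how much RAM is free.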

dd is reading sequentially, so of course it can stream the data at higher speed.
When you're reading through a cursor, the data pages are most likely scattered
throughout the file. Physical random accesses will always be slower than 
sequential reads.

If you leave readahead enabled, you'll get a higher proportion of sequential
read bursts, but it will still be a lot of random access.
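
To see the access pattern in isolation, here is a minimal sketch using a
plain mmap rather than LMDB. It assumes that turning readahead off amounts
to advising the kernel that the mapping will be accessed randomly
(MADV_RANDOM); treat that as my reading of the flag, not a statement of
LMDB's exact code. The function name and the page list are hypothetical.

```python
import mmap
import os

def read_pages_without_readahead(path, page_numbers):
    """Map `path` read-only, advise random access, and fetch the given pages.

    Returns one bytes object per requested page number. Assumes a non-empty
    file; illustrative only, not LMDB's implementation.
    """
    page = mmap.PAGESIZE
    with open(path, "rb") as f:
        size = os.fstat(f.fileno()).st_size
        with mmap.mmap(f.fileno(), size, access=mmap.ACCESS_READ) as m:
            # Ask the kernel not to read ahead, if this platform allows it.
            if hasattr(m, "madvise") and hasattr(mmap, "MADV_RANDOM"):
                m.madvise(mmap.MADV_RANDOM)
            # Each slice touches one page; scattered page numbers mean
            # scattered physical reads on a cold cache.
            return [m[n * page:(n + 1) * page] for n in page_numbers]
```

With the hint in place, only the pages actually touched should be faulted
in on a cold cache, which is why the mapped size stays small but each read
pays the random-access cost.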

> 2. dd reads the entire environment file into system file buffers (93GB). Then, 
> with the entire environment cached, I run the binary with MDB_NORDAHEAD, 
> but now it reads 80GB into shared memory, just as when MDB_NORDAHEAD is not 
> set. Is this expected? Can it be prevented?

It's not reading anything, since the data is already cached in memory.

Is this expected? Yes - the data is already present, and LMDB always
requests a single mmap for the entire size of the environment. Since
the physical memory is already assigned, the mmap contains it all.

Can it be prevented - why does it matter? If any other process needs
to use the RAM, it will get it automatically.

> 
> 
> 
> -----Original Message-----
> From: Howard Chu [mailto:[email protected]] 
> Sent: Friday, June 05, 2020 5:38 AM
> To: Alec Matusis <[email protected]>; [email protected]
> Subject: Re: Unexpected LMDB RSS /performance difference on similar machines
> 
> Howard Chu wrote:
>> Alec Matusis wrote:
>>>> Try repeating the test with MDB_NORDAHEAD set on the environment.
>>>
>>> Thank you: with MDB_NORDAHEAD it works on both machines as expected. 
>>> We have a couple of questions and observations.
>>>
>>> We have:
>>> machine 1:
>>>   XFS filesystem, 148GB RAM, 3.13
>>>   # blockdev --getra /dev/fiob
>>>     256
>>>    Shared memory grows to 13GB with or without MDB_NORDAHEAD (as 
>>> expected)
>>>
>>> machine 2:
>>>   EXT4 filesystem, 105GB RAM, 4.15
>>>   # blockdev --getra /dev/vda2
>>>     256
>>>   Shared memory grows to 83GB without MDB_NORDAHEAD (unexpected) and 
>>> to 13GB with MDB_NORDAHEAD (as expected)
>>>
>>> Questions and observations: 
>>>   1. Since "blockdev --getra" shows the same 256 for both machines, 
>>> why was MDB_NORDAHEAD necessary only on machine 2?
>>
>> This is a stupid question. You claimed both machines have similar 
>> setups and yet they are running wildly different kernel versions and 
>> using completely different filesystems, and now you wonder why they behave 
>> differently??
>>
>> None of this has anything to do with LMDB. Ask a filesystem or kernel 
>> developer.
> 
> For anyone just tuning in - we demonstrated from day 1 the huge difference in 
> performance between different filesystems.
> 
> http://www.lmdb.tech/bench/microbench/july/#sec11
-- 
  -- Howard Chu
  CTO, Symas Corp.           http://www.symas.com
  Director, Highland Sun     http://highlandsun.com/hyc/
  Chief Architect, OpenLDAP  http://www.openldap.org/project/
