Thanks a lot! My proof-of-concept code works OK.

I do not understand all the subtle details of mmap reliability. Could you
please help with these two questions?

If I write data through a pointer to an opaque blob as discussed above, and
my process crashes before mdb_env_sync but the OS doesn't crash, will that
data be safe in the mmap file?

Also, am I correct that mdb_env_sync synchronizes all dirty pages in the
mmap file as seen by the file system, regardless of how they were modified,
whether via the LMDB API or via direct pointer writes?
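
To make the first question concrete, here is a minimal POSIX sketch. It is
not LMDB code. It assumes a Linux-like system with a unified page cache,
where stores into a MAP_SHARED mapping land in kernel-owned pages and so
should outlive the process that made them, as long as the OS keeps running.
The names make_file, write_and_crash, read_back and MAP_LEN are made up for
this illustration. Surviving an OS crash or power loss would still need
msync(), which (my understanding, not verified here) is what mdb_env_sync
calls on the map in MDB_WRITEMAP mode.

```c
#include <assert.h>
#include <fcntl.h>
#include <signal.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <sys/wait.h>
#include <unistd.h>

#define MAP_LEN 4096 /* illustrative size: one typical page */

/* Create (or truncate) a file of MAP_LEN zero bytes. */
static int make_file(const char *path)
{
    int fd = open(path, O_RDWR | O_CREAT | O_TRUNC, 0600);
    if (fd < 0) return -1;
    if (ftruncate(fd, MAP_LEN) != 0) { close(fd); return -1; }
    close(fd);
    return 0;
}

/* Fork a child that maps the file MAP_SHARED, writes msg through the
 * pointer, and is then killed without msync() or munmap().
 * Returns 0 in the parent if the child died from SIGKILL as intended. */
static int write_and_crash(const char *path, const char *msg)
{
    pid_t pid = fork();
    if (pid < 0) return -1;
    if (pid == 0) {
        int fd = open(path, O_RDWR);
        if (fd < 0) _exit(2);
        char *p = mmap(NULL, MAP_LEN, PROT_READ | PROT_WRITE,
                       MAP_SHARED, fd, 0);
        if (p == MAP_FAILED) _exit(3);
        strcpy(p, msg);          /* direct pointer write into the map */
        kill(getpid(), SIGKILL); /* simulated process crash */
        _exit(4);                /* not reached */
    }
    int status = 0;
    if (waitpid(pid, &status, 0) < 0) return -1;
    return (WIFSIGNALED(status) && WTERMSIG(status) == SIGKILL) ? 0 : -1;
}

/* Read the start of the file back with plain read(), as another
 * process would see it through the file system. */
static int read_back(const char *path, char *buf, size_t len)
{
    int fd = open(path, O_RDONLY);
    if (fd < 0) return -1;
    ssize_t n = read(fd, buf, len - 1);
    close(fd);
    if (n < 0) return -1;
    buf[n] = '\0';
    return 0;
}
```

Calling make_file(path), then write_and_crash(path, "hello"), then
read_back(path, buf, sizeof buf) should leave "hello" in buf on such a
system, even though the child never synced or unmapped anything.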

As for "you could at least set a callback to notify you that a block has
moved": if that is implemented, it would be nice to have a notification
*before* a block is moved (with the old and new addresses, so that right
after the callback it is OK to use the new address). Otherwise this
unintended but convenient use of LMDB won't work anymore.
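
A toy sketch of the kind of notification I mean. Nothing here is LMDB API:
toy_block, toy_relocate_cb, toy_block_move and repoint are hypothetical
names, and malloc stands in for whatever the real page allocator would be.
The only point is the ordering: the callback receives both addresses before
the copy happens, so the holder of a raw pointer can switch to the new
address and use it once the move has returned.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical callback type: invoked BEFORE the block is moved,
 * with the old address, the new address, and the block length. */
typedef void (*toy_relocate_cb)(void *old_addr, void *new_addr,
                                size_t len, void *ctx);

/* Toy stand-in for a large record; not an LMDB structure. */
typedef struct {
    void *data;
    size_t len;
    toy_relocate_cb cb;
    void *cb_ctx;
} toy_block;

static int toy_block_init(toy_block *b, size_t len,
                          toy_relocate_cb cb, void *ctx)
{
    b->data = calloc(1, len);
    if (!b->data) return -1;
    b->len = len;
    b->cb = cb;
    b->cb_ctx = ctx;
    return 0;
}

/* Copy-on-write style move: allocate the new location, notify the
 * owner, copy the contents, then release the old location. */
static int toy_block_move(toy_block *b)
{
    void *fresh = malloc(b->len);
    if (!fresh) return -1;
    if (b->cb) b->cb(b->data, fresh, b->len, b->cb_ctx);
    memcpy(fresh, b->data, b->len);
    free(b->data);
    b->data = fresh;
    return 0;
}

static void toy_block_free(toy_block *b)
{
    free(b->data);
    b->data = NULL;
}

/* Example callback: ctx points at the owner's raw pointer, which is
 * switched to the new address if it still refers to the old one. */
static void repoint(void *old_addr, void *new_addr, size_t len, void *ctx)
{
    (void)len;
    void **user_ptr = ctx;
    if (*user_ptr == old_addr) *user_ptr = new_addr;
}
```

With repoint registered and a user pointer set to the block's data, the
pointer should still be usable after toy_block_move returns, because it was
redirected before the old location was freed.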


Best regards,
Victor



On Sat, Oct 3, 2015 at 1:27 AM, Howard Chu <[email protected]> wrote:

> Howard Chu wrote:
>
>> Victor Baybekov wrote:
>>
>>> Thank you! I understand this copy-on-write behavior, but am interested if I
>>> could control it a little. What if I use records that are always much bigger
>>> than a single page, e.g. 100 KB with 4 KB pages, and make sure that a record
>>> is never updated (via LMDB means) during the lifetime of an environment: is
>>> there any scenario in which the location of such a big record could change
>>> during the lifetime of an environment, without updating the record?
>>>
>>
>> At this point in time, no, if you don't update a large record there is no
>> reason that it will move. That is not to say that this won't change in the
>> future. The documentation tells you what promises we are willing to make.
>> Relying on any non-documented behavior is your own responsibility.
>>
>
> Note that the relocation functions in LMDB are intended to accommodate
> blocks being moved around. The actual guts of that API haven't been
> implemented, but probably in 1.x we'll flesh them out. Given that support,
> you could at least set a callback to notify you that a block has moved. But
> currently, overflow pages don't move if they're not modified.
>
>
>>>
>>>
>>> On Fri, Oct 2, 2015 at 4:38 PM, Howard Chu <[email protected]> wrote:
>>>
>>>     Victor Baybekov wrote:
>>>
>>>         Hi,
>>>
>>>         Docs for MDB_RESERVE say that a returned pointer to the reserved
>>>         space is valid "before the next update operation or the transaction
>>>         ends." Docs for MDB_WRITEMAP say that it "writes directly to the
>>>         mmap instead of using malloc for pages." Does combining the two
>>>         options return a pointer directly to a place in a mmap
>>>
>>>
>>>     Yes.
>>>
>>>         so that this pointer could be used after a transaction ends
>>>         or after the next update?
>>>
>>>
>>>     No.
>>>
>>>     Longer answer: maybe.
>>>
>>>     Full answer: LMDB is copy-on-write. If you update another record on the
>>>     same page, in a later transaction, the contents of that page will be
>>>     copied to a new page and the original page will go onto the freelist. In
>>>     that case, the pointer you got must not be used again.
>>>
>>>     If you don't directly update that page and cause it to be copied, then
>>>     you might get lucky and be able to use the pointer for a while. It all
>>>     depends on what other modifications you do and how they affect that node
>>>     or neighboring nodes.
>>>
>>>
>>>         I have a use case where I want to somewhat abuse LMDB safety for
>>>         convenience. If I could get a pointer to a place inside a mmap I
>>>         could work with an LMDB value as an opaque blob or as a region inside
>>>         the single big mmap. This could be more convenient than creating and
>>>         opening hundreds of temporary memory mapped files and keeping open
>>>         handles to them. For example, Aeron terms could be stored like this:
>>>         a stream id per LMDB db and a term id for a key in the db.
>>>
>>>
>>>         Thanks!
>>>         Victor
>>>
>>>
>>>
>>>     --
>>>        -- Howard Chu
>>>        CTO, Symas Corp. http://www.symas.com
>>>        Director, Highland Sun http://highlandsun.com/hyc/
>>>        Chief Architect, OpenLDAP http://www.openldap.org/project/
>>>
>>>
>>>
>>
>>
>
> --
>   -- Howard Chu
>   CTO, Symas Corp.           http://www.symas.com
>   Director, Highland Sun     http://highlandsun.com/hyc/
>   Chief Architect, OpenLDAP  http://www.openldap.org/project/
>
