https://bugs.openldap.org/show_bug.cgi?id=10054

          Issue ID: 10054
           Summary: Value size limited to 2,147,479,552 bytes
           Product: LMDB
           Version: unspecified
          Hardware: x86_64
                OS: Linux
            Status: UNCONFIRMED
          Keywords: needs_review
          Severity: normal
          Priority: ---
         Component: liblmdb
          Assignee: [email protected]
          Reporter: [email protected]
  Target Milestone: ---

Hello,

According to the documentation[0], a database that is not using `MDB_DUPSORT`
can store values up to `0xffffffff` bytes (around 4GB).

In practice, under Linux, the actual limit is `0x7ffff000` though (2^31 - 4096,
so around 2GB).

This is due to the write loop in `mdb_page_flush`. The `wsize` value
determining how many bytes will be written can be as big as
`4096*dp->mp_pages`[1], and the number of overflow pages grows with the size of
the value put inside the DB.

The `wsize` is not split in smaller chunks in the case where there are many
overflow pages to write, and as a result the call to `pwrite`[2] does not
perform a full write, but only a "short" write of 2147479552 bytes (the maximum
allowed on a call to `pwrite` on Linux[3]).

This would be OK if the short write condition was handled by looping and
performing another `pwrite` with the rest of the data, but instead `EIO` is
returned[4].

There seems to be a related, but different issue on macOS when trying to
`pwrite` more the 2^31 bytes, that was already reported[5].

This issue was reported to me by a Meilisearch user because it causes their
database indexing to fail[6]. I had to investigate a bit because their setup
was peculiar (high number of documents in their database) and the `EIO` error
code is not very descriptive of the underlying issue.

I join a C reproducer of the issue that attempts to add a 2147479553 bytes
value to the DB and fails with `EIO` (decreasing `nb_items` to a smaller value
such as `2107479552` does succeed)[7].

Thank you for making LMDB!
Louis Dureuil.

[0]:
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/lmdb.h#LL284C48-L284C58
[1]:
https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#LL3770C33-L3770C45
[2]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3820
[3]:
https://stackoverflow.com/questions/70368651/why-cant-linux-write-more-than-2147479552-bytes
[4]: https://github.com/LMDB/lmdb/blob/mdb.master/libraries/liblmdb/mdb.c#L3840
[5]: https://bugs.openldap.org/show_bug.cgi?id=9736
[6]: https://github.com/meilisearch/meilisearch/issues/3654
[7]: https://github.com/dureuill/lmdb_3654/tree/main

-- 
You are receiving this mail because:
You are on the CC list for the issue.

Reply via email to