On 04/29/2015 02:03 AM, Dan Kennedy wrote:
> On 04/29/2015 01:55 AM, Scott Robison wrote:
>> On Tue, Apr 28, 2015 at 7:08 AM, Hick Gunter <hick at scigames.at> wrote:
>>
>>> Getting "NoMem" sounds very much like a memory leak somewhere, with
>>> the most likely place being your own application, followed by the
>>> wrapper you are using, the FTS code, and lastly the SQLite core.
>>> Lastly, because the SQLite core is extensively tested with an explicit
>>> emphasis on, first, not leaking memory (or other resources) at all
>>> and, second, recovering gracefully from memory allocation failures.
>>>
>> I've seen the same thing from the plain old amalgamation (not
>> sqlite.net). It only happens on *HUGE* (multiples of gigabytes) data
>> sets. At least in my case, it was not a memory leak.
>>
>> It's been a couple of years since I encountered it, and I worked around
>> it on the presumption that the data set used to stress test FTS was
>> atypical and wouldn't be encountered in the wild. Here are the details
>> as best as I can remember them:
>>
>> While inserting records into the FTS table, multiple FTS b-tree
>> structures are created. These are not the same b-trees used in plain
>> vanilla SQLite. Periodically, as the b-trees are created and grow to
>> some size, they are merged into a single b-tree.
>>
>> This merge operation allocates chunks of memory proportionate to the
>> size of the b-trees being merged. Using a contrived example that is not
>> exact, just illustrative:
>>
>> A series of inserts runs until two one-megabyte b-trees are present;
>> these are merged into a single two-megabyte b-tree.
>>
>> Merge 2 2MiB trees into 1 4MiB tree.
>>
>> 2 x 4 MiB = 8 MiB.
>>
>> Lather, rinse, repeat.
>>
>> 2 x 1 GiB = 2 GiB but probably fails due to overhead; if not...
>>
>> 2 x 2 GiB = 4 GiB but almost certainly fails due to overhead; if not...
>>
>> 2 x 4 GiB = 8 GiB definitely fails on a 32 bit system.
>>
>> In reality I never got to the point of allocating chunks of memory that
>> large. The failure happened well under 2 GiB (somewhere within a few
>> hundred MiB of the 1 GiB limit) due to other allocations and OS
>> overhead.
>>
>> I just took a quick glance at the FTS code. As I said, it has been a
>> couple of years, but this looks like the malloc that was failing for me
>> at the time:
>> http://www.sqlite.org/cgi/src/artifact/81f9ed55ad586148?ln=2473
>
> That one is allocating enough space for the doclist associated with a
> single term. Doclists are between, say, 2 and 12 bytes in size for each
> instance of a term in the document set, so they can get quite large for
> common terms ("a", "the", etc.). And the OP does have over a billion
> documents. So I guess if there is a fairly common term in there, that
> allocation could be too large for the OS to satisfy.

Or, really, a 32-bit overflow resulting in a negative value being passed
to sqlite3_malloc(), causing the OOM report. Huh.



