On 29 April 2014 17:54, Stefan Fuhrmann <stefan.fuhrm...@wandisco.com> wrote:
> On Mon, Apr 28, 2014 at 8:11 AM, Ivan Zhakov <i...@visualsvn.com> wrote:
>>
>> On 27 April 2014 19:27, <stef...@apache.org> wrote:
>> > Author: stefan2
>> > Date: Sun Apr 27 15:27:46 2014
>> > New Revision: 1590405
>> >
>> > URL: http://svn.apache.org/r1590405
>> > Log:
>> > More 'svn log -g' memory usage reduction. We use a hash to keep track
>> > of all revisions reported so far, i.e. easily a million.
>> >
>> Hi Stefan,
>>
>> Interesting findings, some comments below.
>>
>> > That is 48 bytes / rev, allocated in small chunks. The first results
>> > in 10s of MB dynamic memory usage while the other results in many 8k
>> > blocks being mmap()ed, risking reaching the per-process limit on some
>> > systems.
>> I don't understand this argument: why do small allocations result in
>> 10s of MB of memory usage? Doesn't the pool allocator aggregate small
>> memory allocations into 8k blocks?
>
> 1M x 48 bytes = 10s of MB. There are two problems
> I'm addressing here for 'svn log -g' (log without -g does
> not have those issues):

ack.
> * --limit applies to "top-level" revisions, not the merged ones.
>   If you log some integration branch, it may show only a
>   few top-level revs but, say, 100k merged revs. That is fine
>   with 1.8 and even more so with 1.9 as we deliver the info quickly.
>   But the server memory usage should remain in check even
>   for more extreme scenarios / repo sizes.
>
> * Some system-provided APR (1.5+ in particular) uses mmap
>   to allocate memory, i.e. for every block, e.g. 8k, there is a
>   separate mmap call. The Linux default is 65530 (sic!) mmap
>   regions per process. Slowly allocating pools can trigger OOM
>   errors after only 512MB actual memory usage (sum across
>   all threads). I already prepared a patch for that.
>
Ouch, I didn't know that. I was thinking that the mmap APR pool
allocator was experimental and not enabled by default.

>> > We introduce a simple packed bit array data structure to replace
>> > the hash. For repos < 100M revs, the initialization overhead is less
>> > than 1ms and will amortize as soon as more than 1% of all revs are
>> > reported.
>> >
>> It may be worth implementing the same trick we used in
>> membuffer_cache: use an array of bit arrays, one for every 100k
>> revisions, and initialize them lazily. I mean:
>> [0...99999] - bit array 0
>> [100000...199999] - bit array 1
>> ...
>>
>> It should be easy to implement.
>
> I gave it a try and it turned out not too horribly complex.
> See r1590982.

Great! But it may be worth keeping the original svn_bit_array and
adding a new svn_sparse_bit_array built as an array of svn_bit_array
objects, so things are separated into two micro layers.

-- 
Ivan Zhakov
CTO | VisualSVN | http://www.visualsvn.com
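
[Editor's illustration: a minimal sketch of the two-layer design discussed
above, with a packed bit array as the lower layer and a sparse wrapper that
lazily allocates one such array per fixed-size revision chunk. All names
and the layout are hypothetical and do not reproduce Subversion's actual
svn_bit_array API; the real code allocates from APR pools rather than
malloc. The point of the structure: at 1 bit per revision, a fully
populated million-revision range costs about 122 KB versus roughly 46 MB
for a 48-bytes-per-entry hash, and untouched chunks cost nothing.]

/* Two-layer bit array sketch: dense lower layer (bit_array) plus a
 * sparse upper layer (sparse_bit_array) that creates one dense chunk
 * per CHUNK_SIZE revisions on first write. */
#include <stdlib.h>
#include <stdio.h>

#define CHUNK_SIZE 100000           /* revisions covered per dense chunk */

/* Lower layer: fixed-size packed bit array, 1 bit per revision. */
typedef struct bit_array {
  unsigned char *bits;              /* CHUNK_SIZE bits, zero-initialized */
} bit_array;

static bit_array *bit_array_create(void)
{
  bit_array *ba = malloc(sizeof(*ba));
  ba->bits = calloc((CHUNK_SIZE + 7) / 8, 1);   /* 12.5 KB per chunk */
  return ba;
}

static void bit_array_set(bit_array *ba, unsigned long idx)
{
  ba->bits[idx / 8] |= (unsigned char)(1u << (idx % 8));
}

static int bit_array_get(const bit_array *ba, unsigned long idx)
{
  return (ba->bits[idx / 8] >> (idx % 8)) & 1;
}

/* Upper layer: array of lazily created dense chunks. */
typedef struct sparse_bit_array {
  bit_array **chunks;               /* NULL until a bit in the chunk is set */
  unsigned long chunk_count;
} sparse_bit_array;

static sparse_bit_array *sparse_create(unsigned long max_index)
{
  sparse_bit_array *sba = malloc(sizeof(*sba));
  sba->chunk_count = max_index / CHUNK_SIZE + 1;
  sba->chunks = calloc(sba->chunk_count, sizeof(bit_array *));
  return sba;
}

static void sparse_set(sparse_bit_array *sba, unsigned long idx)
{
  unsigned long chunk = idx / CHUNK_SIZE;
  if (sba->chunks[chunk] == NULL)           /* lazy initialization */
    sba->chunks[chunk] = bit_array_create();
  bit_array_set(sba->chunks[chunk], idx % CHUNK_SIZE);
}

static int sparse_get(const sparse_bit_array *sba, unsigned long idx)
{
  unsigned long chunk = idx / CHUNK_SIZE;
  if (sba->chunks[chunk] == NULL)           /* untouched chunk => all zero */
    return 0;
  return bit_array_get(sba->chunks[chunk], idx % CHUNK_SIZE);
}

int main(void)
{
  sparse_bit_array *seen = sparse_create(1600000);  /* ~1.6M revisions */
  sparse_set(seen, 1590405);
  printf("r1590405 reported: %d\n", sparse_get(seen, 1590405));
  printf("r42 reported:      %d\n", sparse_get(seen, 42));
  return 0;
}

A 'log -g' run that only touches recent revisions then pays for a single
12.5 KB chunk instead of the whole range, while set/get stay O(1); keeping
the dense array as its own layer, as proposed above, lets callers that
cover the full range skip the sparse wrapper entirely.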