I know that avoiding duplication is the safest solution, but it usually
takes a lot of work. I would rather let the platform do that for us, or
pretend that it can. We can use the idea of shm_open(3) to merge
duplicates of big strings. The easiest way to do that is a specialized
memcpy. It looks like
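
  void *
  wrap_memcpy(void *dest, void *src, size_t n) {
    /* Small copies take the plain memcpy() path. BIG_THRESHOLD is a
     * placeholder for whatever size we decide counts as "big". */
    if (n < BIG_THRESHOLD) return memcpy(....);
    /* Big copies: make sure a back storage holds src, then map it. */
    if (!find_in_backstorages(src))
      make_backstorage(src, n);
    return map_to_backstorage(dest, find_in_backstorages(src), n);
  }
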
It costs a little extra CPU time for small copies, although it is
costly for big copies. By wrapping memcpy(), we can get what
we want the platform/kernel to do.
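
To make the back-storage idea concrete, here is a minimal sketch, assuming
Linux with POSIX shared memory available. The two helpers are hypothetical
and differ from the pseudocode above: instead of mapping onto a given dest,
map_backstorage() returns a new private mapping, which avoids requiring
dest to be page aligned. It expects n > 0, and older glibc needs -lrt for
shm_open().

```c
#include <fcntl.h>
#include <stdio.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper: copy n bytes of src into a fresh shared-memory
 * object and return its fd, or -1 on failure. */
static int make_backstorage(const void *src, size_t n) {
    static unsigned counter;
    char name[64];
    snprintf(name, sizeof name, "/backstorage-%ld-%u",
             (long)getpid(), counter++);

    int fd = shm_open(name, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (fd < 0) return -1;
    shm_unlink(name);              /* the fd keeps the object alive */

    if (ftruncate(fd, (off_t)n) != 0) { close(fd); return -1; }

    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_SHARED, fd, 0);
    if (p == MAP_FAILED) { close(fd); return -1; }
    memcpy(p, src, n);             /* the only real copy of the data */
    munmap(p, n);
    return fd;
}

/* Hypothetical helper: return a private, copy-on-write view of the back
 * storage, or NULL on failure. Pages are shared with the back storage
 * until the caller writes to them. */
static void *map_backstorage(int fd, size_t n) {
    void *p = mmap(NULL, n, PROT_READ | PROT_WRITE, MAP_PRIVATE, fd, 0);
    return p == MAP_FAILED ? NULL : p;
}
```

Every "copy" obtained from map_backstorage() shares physical pages with
the back storage; writing to one copy only duplicates the touched pages
and leaves the other copies unchanged.
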
Maybe we should apply this specialized memcpy only to well-chosen
functions, for example in sqlite. This would be easier than avoiding
duplication.
I guess we only need to keep 2 or 3 back storages per process,
and forget old storages in LRU order. It is not normal, and not our
purpose, to handle a large number of big duplications that differ in
content.
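
The LRU bookkeeping for so few back storages can stay very small. Below is
a sketch only: the names (lru_find, lru_insert, MAX_BACKSTORAGES) and the
idea of looking storages up by a 64-bit content key such as a hash are my
assumptions, not existing code. Handles are assumed to be non-negative,
like file descriptors.

```c
#include <stddef.h>
#include <stdint.h>

#define MAX_BACKSTORAGES 3

struct backstorage {
    int      in_use;     /* 0 while the slot is empty */
    uint64_t key;        /* hypothetical content key, e.g. a hash */
    int      handle;     /* e.g. the fd of the back storage */
    uint64_t last_used;  /* logical clock value of the last hit */
};

static struct backstorage slots[MAX_BACKSTORAGES];
static uint64_t tick;

/* Return the handle stored under key, or -1 if there is none.
 * A hit marks the slot as most recently used. */
static int lru_find(uint64_t key) {
    for (size_t i = 0; i < MAX_BACKSTORAGES; i++) {
        if (slots[i].in_use && slots[i].key == key) {
            slots[i].last_used = ++tick;
            return slots[i].handle;
        }
    }
    return -1;
}

/* Store handle under key, reusing an empty slot or else the least
 * recently used one. Returns the evicted handle so the caller can
 * release it, or -1 if nothing was evicted. */
static int lru_insert(uint64_t key, int handle) {
    size_t victim = 0;
    for (size_t i = 0; i < MAX_BACKSTORAGES; i++) {
        if (!slots[i].in_use) { victim = i; break; }
        if (slots[i].last_used < slots[victim].last_used) victim = i;
    }
    int evicted = slots[victim].in_use ? slots[victim].handle : -1;
    slots[victim].in_use = 1;
    slots[victim].key = key;
    slots[victim].handle = handle;
    slots[victim].last_used = ++tick;
    return evicted;
}
```

With only 2 or 3 slots a linear scan is enough; the caller would close
whatever handle lru_insert() returns when it is not -1.
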
From: Ting-Yuan Huang <[email protected]>
Subject: Re: [b2g] Bug 850175
Date: Sat, 27 Apr 2013 23:59:20 -0700 (PDT)
> Oh, I see. I should read the code more carefully :(
>
> But when memory is allocated quickly, it is likely that the low memory flag
> is raised again within 5 seconds. In this case memory pressure will not kick
> off GC subsequently[1]. Maybe the sampling rate should be higher.
>
> [1] Unless there are other GC reasons and memory pressure drops. We have to
> be lucky; it is more likely to OOM.
>
> ----- Original Message -----
> From: "Justin Lebar" <[email protected]>
> To: "Ting-Yuan Huang" <[email protected]>
> Cc: [email protected], [email protected], "Thinker K.F. Li"
> <[email protected]>
> Sent: Sunday, April 28, 2013 2:12:51 PM
> Subject: Re: [b2g] Bug 850175
>
>> By the way, low memory warnings can be utilized better. Currently, on B2G, a
>> flag in sysfs is
>> checked every 5 seconds. This obviously is not the best.
>
> That's only when we're in a low-memory state. Otherwise we poll() the
> fd and get notified immediately on a bg thread. But we still have to
> dispatch an event to the main thread to notify it of memory pressure;
> we can't do much from off the main thread.
>
> On Sun, Apr 28, 2013 at 2:04 AM, Ting-Yuan Huang <[email protected]> wrote:
>> Could we make those large strings or arrays copy-on-write? At the OS level,
>> we probably can make new pages COW[1]. Or we can implement it in
>> SpiderMonkey[2]. That would not only save memory, but also improve
>> performance, I guess.
>>
>> By the way, low memory warnings can be utilized better. Currently, on B2G, a
>> flag in sysfs is checked every 5 seconds. This obviously is not the best.
>> Maybe we could just make the low memory killer send signal 35 before SIGKILL.
>> Together with tuning the thresholds, exceptions should happen rarely.
>>
>> [1] I tried to mmap() /proc/$pid/mem but failed; /proc/$pid/mem can't be
>> memory mapped. Some systems seem to have SHM_COPY for shmat(), but Linux
>> does not seem to. Still trying to find a solution.
>>
>> [2] There should be some performance overhead. Not sure if most of the
>> write-checks can be optimized away by JITs.
>>
>>
>> ----- Original Message -----
>> From: "Justin Lebar" <[email protected]>
>> To: "Thinker K.F. Li" <[email protected]>
>> Cc: [email protected], [email protected], [email protected]
>> Sent: Saturday, April 27, 2013 10:22:34 PM
>> Subject: Re: [b2g] Bug 850175
>>
>>> 3. trigger scanning on low-memory warning.
>>
>> We have learned not to rely on low-memory warnings. We should still
>> use them, but we should consider them to be an emergency measure which
>> may or may not work.
>>
>> The problem is, often a program allocates too fast to see the
>> low-memory warning. For example, in bug 865929, we have a cache of
>> images that are drawn to canvases. That cache was becoming very large
>> and causing us to crash, so we'd assumed (e.g. in the bug title) that
>> the cache did not listen to memory pressure events.
>>
>> But it turns out, the cache /does/ listen to low-memory events, but we
>> don't act on those events quickly enough to prevent a crash.
>>
>> I expect we can invoke KSM off the main thread, so we could run it
>> sooner than we can run a GC, for example. But still, I don't think we
>> should rely on it.
>>
>> The safest thing to do, I think, is not to copy the string many times.
>>
>> On Sat, Apr 27, 2013 at 2:37 AM, Thinker K.F. Li <[email protected]> wrote:
>>> From: Ting-Yuan Huang <[email protected]>
>>> Subject: Re: Bug 850175
>>> Date: Fri, 26 Apr 2013 21:09:50 -0700 (PDT)
>>>
>>>> KSM requires those strings to be aligned (at the same offsets from page
>>>> boundaries). It should be fine in this case, but I'm not quite sure.
>>>
>>> For jemalloc, alignment is guaranteed for big memory allocations. For JS
>>> strings, Greg told me we can use external string objects. I think we
>>> can ensure page alignment for external string objects.
>>>
>>>>
>>>> Another characteristic of KSM is that it scans periodically. From the
>>>> discussion on bugzilla it seems that we are suffering from peak memory
>>>> usage. I'm afraid that the original, unmodified KSM can't really help.
>>>> I'll try to find if there are ways in userspace to make duplicated pages
>>>> COW.
>>>
>>> I have looked into the KSM code. If I am right, we can mark all big
>>> strings after they are created, and trigger KSM to scan and merge
>>> on a low-memory warning. Then these big strings will be merged at the
>>> time of the low-memory warning. Another issue is that the number of pages
>>> scanned is limited. We should pick a good value.
>>>
>>> With the following recipe, I guess the big strings will be merged in time.
>>> 1. advise big strings after they are created and filled.
>>> 2. periodically trigger scanning by writing to /sys/kernel/mm/ksm/run. (opt)
>>> 3. trigger scanning on low-memory warning.
>>>
>>>>
>>>> ----- Original Message -----
>>>> From: "Thinker K.F. Li" <[email protected]>
>>>> To: [email protected]
>>>> Cc: [email protected]
>>>> Sent: Saturday, April 27, 2013 1:58:10 AM
>>>> Subject: Re: Bug 850175
>>>>
>>>> I talked to Greg. He told me the same string is duplicated
>>>> 15 times when it is inserted into IndexedDB. IndexedDB itself duplicates
>>>> it at least 6 times. I think KSM can do a good job here.
>>>>
>>>> KSM can work well if we advise only big strings and the like; it is a
>>>> trade-off between overhead and memory. We can trigger it to start scanning
>>>> and merging when memory is low.
>>>>
>>>> From: Thinker K.F. Li <[email protected]>
>>>> Subject: Bug 850175
>>>> Date: Sat, 27 Apr 2013 00:20:22 +0800 (CST)
>>>>
>>>>> Hi Ting-Yuan,
>>>>>
>>>>> Tonight, people are talking about bug 850175 on the #b2g channel. There
>>>>> are two issues in that bug; one of them is that twitter creates a big
>>>>> string and sends it to IndexedDB. This causes a lot of string
>>>>> duplication at the peak. Since you are trying the kernel's
>>>>> samepage merging feature, I guess it is a good solution for this. We can
>>>>> advise only big strings to reduce the scanning load.
>>>>> What do you think?
>>>>>
>>>>> see https://bugzilla.mozilla.org/show_bug.cgi?id=850175#c74
>>> _______________________________________________
>>> dev-b2g mailing list
>>> [email protected]
>>> https://lists.mozilla.org/listinfo/dev-b2g