The most recent out-of-memory situations we had were all unrelated to JS and GC. They were not even CC related. In basically every case, caches held on to too much memory too quickly, and the process died before the notification came in. I think the key strategy is to make sure that every cache is bounded to a reasonable size. We limited eager image decoding to the first 24 images, and the canvas image cache to 10MB. That gives the out-of-memory notification a chance to kick in while we are in a low-but-survivable state, instead of arriving too late because we tried to hold on to hundreds of megabytes of image data.
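
As a rough illustration of the bounded-cache idea (a sketch only, not the actual Gecko code; the BoundedCache class, its method names, and the LRU eviction policy are assumptions made for this example), a cache with a hard byte budget might look like this:

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical LRU cache that never holds more than max_bytes of data."""

    def __init__(self, max_bytes):
        self.max_bytes = max_bytes
        self.used_bytes = 0
        self._entries = OrderedDict()  # key -> bytes, least recently used first

    def put(self, key, data):
        # Replace any existing entry so its size is not counted twice.
        if key in self._entries:
            self.used_bytes -= len(self._entries.pop(key))
        # An item larger than the whole budget is never cached.
        if len(data) > self.max_bytes:
            return False
        self._entries[key] = data
        self.used_bytes += len(data)
        # Evict least recently used entries until we are back under budget.
        while self.used_bytes > self.max_bytes:
            _, evicted = self._entries.popitem(last=False)
            self.used_bytes -= len(evicted)
        return True

    def get(self, key):
        data = self._entries.get(key)
        if data is not None:
            self._entries.move_to_end(key)  # mark as most recently used
        return data

    def on_memory_pressure(self):
        # Emergency path only: drop everything when a low-memory event arrives.
        self._entries.clear()
        self.used_bytes = 0


# 10MB budget, mirroring the canvas image cache limit mentioned above.
cache = BoundedCache(max_bytes=10 * 1024 * 1024)
cache.put("image-1", b"\x00" * 4096)
```

The point of the design is that the budget is enforced on every insertion, so the cache stays small even if the low-memory event never arrives in time; on_memory_pressure() is only the fallback.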

Andreas

On Apr 28, 2013, at 10:07 AM, Ting-Yuan Huang <[email protected]> wrote:

>> 5s may not be the right value, but it's very tricky to set correctly.
>> Every process will wake up and run code every X seconds under memory
>> pressure. If we set X too low, we'll spend most of our time flushing
>> caches and never do any useful work!
>
> Or we can make memory pressure a hint, and let the garbage collector
> itself decide whether to go or not. For example, if the heap size grew
> very little, say only 5%, compared to the last GC, the collector may
> decide not to kick in. This should be the most common case for
> background processes. Only the foreground process would GC repeatedly,
> and performance should be fine.
>
> The current dispatcher hides information; it filters out some critical
> moments at which GC should kick in. I agree this may not always work in
> time and we should not rely on it, but it can be tuned and is feasible
> most of the time.
>
> ----- Original Message -----
> From: "Justin Lebar" <[email protected]>
> To: "Ting-Yuan Huang" <[email protected]>
> Cc: [email protected], [email protected], "Thinker K.F. Li"
> <[email protected]>
> Sent: Sunday, April 28, 2013 11:13:43 PM
> Subject: Re: [b2g] Bug 850175
>
>> But when memory is allocated quickly, it is likely that the low-memory
>> flag will be raised again within 5 seconds. In this case memory
>> pressure will not kick off a GC subsequently[1]. Maybe the sampling
>> rate should be higher.
>
> 5s may not be the right value, but it's very tricky to set correctly.
> Every process will wake up and run code every X seconds under memory
> pressure. If we set X too low, we'll spend most of our time flushing
> caches and never do any useful work!
>
> This is why I keep saying that memory pressure is an emergency
> measure, and that we shouldn't rely on it.
>
> On Sun, Apr 28, 2013 at 2:59 AM, Ting-Yuan Huang <[email protected]> wrote:
>> Oh, I see. I should read the code more carefully :(
>>
>> But when memory is allocated quickly, it is likely that the low-memory
>> flag will be raised again within 5 seconds. In this case memory
>> pressure will not kick off a GC subsequently[1]. Maybe the sampling
>> rate should be higher.
>>
>> [1] Unless there are other GC reasons and memory pressure drops. We
>> have to be lucky; it is more likely to OOM.
>>
>> ----- Original Message -----
>> From: "Justin Lebar" <[email protected]>
>> To: "Ting-Yuan Huang" <[email protected]>
>> Cc: [email protected], [email protected], "Thinker K.F. Li"
>> <[email protected]>
>> Sent: Sunday, April 28, 2013 2:12:51 PM
>> Subject: Re: [b2g] Bug 850175
>>
>>> By the way, low memory warnings can be utilized better. Currently, on
>>> B2G, a flag in sysfs is checked every 5 seconds. This obviously is
>>> not the best.
>>
>> That's only when we're in a low-memory state. Otherwise we poll() the
>> fd and get notified immediately on a bg thread. But we still have to
>> dispatch an event to the main thread to notify it of memory pressure;
>> we can't do much from off the main thread.
>>
>> On Sun, Apr 28, 2013 at 2:04 AM, Ting-Yuan Huang <[email protected]> wrote:
>>> Could we make those large strings or arrays copy-on-write? At the OS
>>> level, we can probably make new pages COW[1]. Or we can implement it
>>> in SpiderMonkey[2]. That would not only save memory, but also improve
>>> performance, I guess.
>>>
>>> By the way, low memory warnings can be utilized better. Currently, on
>>> B2G, a flag in sysfs is checked every 5 seconds. This obviously is
>>> not the best. Maybe we could just make the low memory killer send
>>> signal 35 before SIGKILL. Together with tuning the thresholds,
>>> exceptions should happen rarely.
>>>
>>> [1] I tried to mmap() /proc/$pid/mem but failed; /proc/$pid/mem can't
>>> be memory mapped. Some systems seem to have a SHM_COPY flag for
>>> shmat(), but Linux seems not to. Still trying to find a solution.
>>>
>>> [2] There should be some performance overhead. Not sure if most of
>>> the write-checks can be optimized away by JITs.
>>>
>>>
>>> ----- Original Message -----
>>> From: "Justin Lebar" <[email protected]>
>>> To: "Thinker K.F. Li" <[email protected]>
>>> Cc: [email protected], [email protected], [email protected]
>>> Sent: Saturday, April 27, 2013 10:22:34 PM
>>> Subject: Re: [b2g] Bug 850175
>>>
>>>> 3. trigger scanning for low-memory warning.
>>>
>>> We have learned not to rely on low-memory warnings. We should still
>>> use them, but we should consider them to be an emergency measure which
>>> may or may not work.
>>>
>>> The problem is, often a program allocates too fast to see the
>>> low-memory warning. For example, in bug 865929, we have a cache of
>>> images that are drawn to canvases. That cache was becoming very large
>>> and causing us to crash, so we'd assumed (e.g. in the bug title) that
>>> the cache did not listen to memory pressure events.
>>>
>>> But it turns out, the cache /does/ listen to low-memory events, but we
>>> don't act on those events quickly enough to prevent a crash.
>>>
>>> I expect we can invoke KSM off the main thread, so we could run it
>>> sooner than we can run a GC, for example. But still, I don't think we
>>> should rely on it.
>>>
>>> The safest thing to do, I think, is not to copy the string many times.
>>>
>>> On Sat, Apr 27, 2013 at 2:37 AM, Thinker K.F. Li <[email protected]>
>>> wrote:
>>>> From: Ting-Yuan Huang <[email protected]>
>>>> Subject: Re: Bug 850175
>>>> Date: Fri, 26 Apr 2013 21:09:50 -0700 (PDT)
>>>>
>>>>> KSM requires those strings to be aligned (to same offsets to page
>>>>> boundaries). It should be fine in this case, but I'm not quite sure.
>>>>
>>>> With jemalloc, that is quite certain for big memory allocations. For
>>>> JS strings, Greg told me we can use external string objects. I think
>>>> we can ensure page alignment for external string objects.
>>>>
>>>>>
>>>>> Another characteristic of KSM is that it scans periodically. From the
>>>>> discussion on bugzilla it seems that we are suffering from peak memory
>>>>> usage. I'm afraid that the original, unmodified KSM can't really help.
>>>>> I'll try to find if there are ways in userspace to make duplicated
>>>>> pages COW.
>>>>
>>>> I have looked into the KSM code. If I am right, we can mark all big
>>>> strings after they are created, and trigger KSM to do scanning and
>>>> merging on a low-memory warning. Then these big strings will be
>>>> merged at the time of the low-memory warning. Another issue is that
>>>> the number of pages scanned is limited. We should pick a good value.
>>>>
>>>> With the following recipe, I guess the big strings will be merged in
>>>> time.
>>>> 1. advise big strings after they are created and filled.
>>>> 2. periodically trigger scanning by writing to /sys/kernel/mm/ksm/run. (opt)
>>>> 3. trigger scanning for low-memory warning.
>>>>
>>>>>
>>>>> ----- Original Message -----
>>>>> From: "Thinker K.F. Li" <[email protected]>
>>>>> To: [email protected]
>>>>> Cc: [email protected]
>>>>> Sent: Saturday, April 27, 2013 1:58:10 AM
>>>>> Subject: Re: Bug 850175
>>>>>
>>>>> I talked to Greg. He told me the same string will be duplicated 15
>>>>> times when it is inserted into indexedDB. For indexedDB alone, it is
>>>>> duplicated at least 6 times. I think KSM can do a good job here.
>>>>>
>>>>> KSM can work well if we advise only big strings or the like; it is
>>>>> a trade-off between overhead and memory. We can trigger it to start
>>>>> scanning and merging when memory is low.
>>>>>
>>>>> From: Thinker K.F. Li <[email protected]>
>>>>> Subject: Bug 850175
>>>>> Date: Sat, 27 Apr 2013 00:20:22 +0800 (CST)
>>>>>
>>>>>> Hi Ting-Yuan,
>>>>>>
>>>>>> Tonight, people are talking about bug 850175 on the #b2g channel.
>>>>>> There are two issues in that bug; one of them is that Twitter
>>>>>> creates a big string and sends it to indexedDB. That causes a lot
>>>>>> of string duplication at the peak. Since you are trying the kernel
>>>>>> feature of samepage merging, I guess it would be a good solution.
>>>>>> We can advise only big strings to reduce the scanning load.
>>>>>> What do you think?
>>>>>>
>>>>>> see https://bugzilla.mozilla.org/show_bug.cgi?id=850175#c74
>>>> _______________________________________________
>>>> dev-b2g mailing list
>>>> [email protected]
>>>> https://lists.mozilla.org/listinfo/dev-b2g
