Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, 4 Feb 2015, Minchan Kim wrote: > > > This is a result of allowing something external (process B) be able to > > > clear hwm so that you never knew the value went to 100MB. That's the > > > definition of a race, I don't know how to explain it any better and making > > > any connection between clearing PG_referenced and mm->hiwater_rss is a > > > stretch. This approach just makes mm->hiwater_rss meaningless. > > > > I understand your concern, but I hope you agree that the functionality we > > are proposing would be very useful for profiling. Therefore, I suggest > > adding an extra resettable field to /proc/pid/status (e.g. > > resettable_hiwater_rss) instead. What is your view on this approach? > > The idea would be very useful for measuring working set size for > efficient memory management in userside, which becomes very popular > with many platforms for embedded world with tight memory. > The problem is the same as the aforementioned if you're only going to be adding one field. If another process happens to clear the resettable_hiwater_rss before you can read it, you don't see potentially large spikes in size. I understand the need for measuring working set size, and we have an in-house solution for that, but I don't think we should be introducing new fields that require only one root process on the system to be touching it for it to be effective. Let me talk with some people about how difficult it would be to propose our in-house solution. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
Hello, On Tue, Feb 03, 2015 at 03:26:28AM +, Petr Cermak wrote: > On Mon, Jan 26, 2015 at 04:00:20PM -0800, David Rientjes wrote: > > [...] > > This is a result of allowing something external (process B) be able to > > clear hwm so that you never knew the value went to 100MB. That's the > > definition of a race, I don't know how to explain it any better and making > > any connection between clearing PG_referenced and mm->hiwater_rss is a > > stretch. This approach just makes mm->hiwater_rss meaningless. > > I understand your concern, but I hope you agree that the functionality we > are proposing would be very useful for profiling. Therefore, I suggest > adding an extra resettable field to /proc/pid/status (e.g. > resettable_hiwater_rss) instead. What is your view on this approach? The idea would be very useful for measuring working set size for efficient memory management in userside, which becomes very popular with many platforms for embedded world with tight memory. > > -- > To unsubscribe, send a message with 'unsubscribe linux-mm' in > the body to majord...@kvack.org. For more info on Linux MM, > see: http://www.linux-mm.org/ . > Don't email: mailto:"d...@kvack.org;> em...@kvack.org -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
Hello, On Tue, Feb 03, 2015 at 03:26:28AM +, Petr Cermak wrote: On Mon, Jan 26, 2015 at 04:00:20PM -0800, David Rientjes wrote: [...] This is a result of allowing something external (process B) be able to clear hwm so that you never knew the value went to 100MB. That's the definition of a race, I don't know how to explain it any better and making any connection between clearing PG_referenced and mm-hiwater_rss is a stretch. This approach just makes mm-hiwater_rss meaningless. I understand your concern, but I hope you agree that the functionality we are proposing would be very useful for profiling. Therefore, I suggest adding an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss) instead. What is your view on this approach? The idea would be very useful for measuring working set size for efficient memory management in userside, which becomes very popular with many platforms for embedded world with tight memory. -- To unsubscribe, send a message with 'unsubscribe linux-mm' in the body to majord...@kvack.org. For more info on Linux MM, see: http://www.linux-mm.org/ . Don't email: a href=mailto:d...@kvack.org; em...@kvack.org /a -- Kind regards, Minchan Kim -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, 4 Feb 2015, Minchan Kim wrote: This is a result of allowing something external (process B) be able to clear hwm so that you never knew the value went to 100MB. That's the definition of a race, I don't know how to explain it any better and making any connection between clearing PG_referenced and mm-hiwater_rss is a stretch. This approach just makes mm-hiwater_rss meaningless. I understand your concern, but I hope you agree that the functionality we are proposing would be very useful for profiling. Therefore, I suggest adding an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss) instead. What is your view on this approach? The idea would be very useful for measuring working set size for efficient memory management in userside, which becomes very popular with many platforms for embedded world with tight memory. The problem is the same as the aforementioned if you're only going to be adding one field. If another process happens to clear the resettable_hiwater_rss before you can read it, you don't see potentially large spikes in size. I understand the need for measuring working set size, and we have an in-house solution for that, but I don't think we should be introducing new fields that require only one root process on the system to be touching it for it to be effective. Let me talk with some people about how difficult it would be to propose our in-house solution. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Mon, Jan 26, 2015 at 04:00:20PM -0800, David Rientjes wrote: > [...] > This is a result of allowing something external (process B) be able to > clear hwm so that you never knew the value went to 100MB. That's the > definition of a race, I don't know how to explain it any better and making > any connection between clearing PG_referenced and mm->hiwater_rss is a > stretch. This approach just makes mm->hiwater_rss meaningless. I understand your concern, but I hope you agree that the functionality we are proposing would be very useful for profiling. Therefore, I suggest adding an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss) instead. What is your view on this approach? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Mon, Jan 26, 2015 at 04:00:20PM -0800, David Rientjes wrote: [...] This is a result of allowing something external (process B) be able to clear hwm so that you never knew the value went to 100MB. That's the definition of a race, I don't know how to explain it any better and making any connection between clearing PG_referenced and mm-hiwater_rss is a stretch. This approach just makes mm-hiwater_rss meaningless. I understand your concern, but I hope you agree that the functionality we are proposing would be very useful for profiling. Therefore, I suggest adding an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss) instead. What is your view on this approach? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Fri, 23 Jan 2015, Primiano Tucci wrote: > > If you reset the hwm for a process, rss grows to 100MB, another process > > resets the hwm, and you see a hwm of 2MB, that invalidates the hwm > > entirely. > > Not sure I follow this scenario. Where does the 2MB come from? It's a random number that the hwm gets reset to after the other process clears it. > How can > you see a hwm of 2MB, under which conditions? HVM can never be < RSS. > Again, what you are talking about is the case of two profilers racing > for using the same interface (hwm). > This is the same case today of the PG_referenced bit. > PG_referenced bit is not tracking the highest rss a process has ever attained. PG_referenced is understood to be clearable at any time and the only guarantee is that it was at least cleared before returning from the write. It could be set again before the write returns as well, but we can be sure that it was at least cleared. With your approach, which completely invalidates the entire purpose of hwm, the following is possible: process A process B - - read hwm = 50MB read hwm = 50MB write to clear hwm rss goes to 100MB write to clear hwm rss goes to 2MB read hwm = 2MB read hwm = 2MB This is a result of allowing something external (process B) be able to clear hwm so that you never knew the value went to 100MB. That's the definition of a race, I don't know how to explain it any better and making any connection between clearing PG_referenced and mm->hiwater_rss is a stretch. This approach just makes mm->hiwater_rss meaningless. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Fri, 23 Jan 2015, Primiano Tucci wrote: If you reset the hwm for a process, rss grows to 100MB, another process resets the hwm, and you see a hwm of 2MB, that invalidates the hwm entirely. Not sure I follow this scenario. Where does the 2MB come from? It's a random number that the hwm gets reset to after the other process clears it. How can you see a hwm of 2MB, under which conditions? HVM can never be RSS. Again, what you are talking about is the case of two profilers racing for using the same interface (hwm). This is the same case today of the PG_referenced bit. PG_referenced bit is not tracking the highest rss a process has ever attained. PG_referenced is understood to be clearable at any time and the only guarantee is that it was at least cleared before returning from the write. It could be set again before the write returns as well, but we can be sure that it was at least cleared. With your approach, which completely invalidates the entire purpose of hwm, the following is possible: process A process B - - read hwm = 50MB read hwm = 50MB write to clear hwm rss goes to 100MB write to clear hwm rss goes to 2MB read hwm = 2MB read hwm = 2MB This is a result of allowing something external (process B) be able to clear hwm so that you never knew the value went to 100MB. That's the definition of a race, I don't know how to explain it any better and making any connection between clearing PG_referenced and mm-hiwater_rss is a stretch. This approach just makes mm-hiwater_rss meaningless. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Thu, Jan 22, 2015 at 11:27 PM, David Rientjes wrote: > If you reset the hwm for a process, rss grows to 100MB, another process > resets the hwm, and you see a hwm of 2MB, that invalidates the hwm > entirely. Not sure I follow this scenario. Where does the 2MB come from? How can you see a hwm of 2MB, under which conditions? HVM can never be < RSS. Again, what you are talking about is the case of two profilers racing for using the same interface (hwm). This is the same case today of the PG_referenced bit. > The hwm is already defined as the > highest rss the process has attained, resetting it and trying to make any > inference from the result is racy and invalidates the actual value which > is useful. The counter arugment is: once you have one very high peak, the hvm becomes essentially useless for the rest of the lifetime of the process (until a higher peak comes). This makes very hard to understand what is going on in the meanwhile (from userspace). Anyways, are you proposing to pursue a different approach? Is the approach 2. that petrcermark@ proposed in the beginning of the thread going to address this concern? -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Thu, 22 Jan 2015, Primiano Tucci wrote: > > I think the bigger concern would be that this, and any new line such as > > resettable_hiwater_rss, invalidates itself entirely. Any process that > > checks the hwm will not know of other processes that reset it, so the > > value itself has no significance anymore. > > It would just be the mark since the last clear at an unknown time. > > How is that different from the current logic of clear_refs and the > corresponding PG_Referenced bit? > If you reset the hwm for a process, rss grows to 100MB, another process resets the hwm, and you see a hwm of 2MB, that invalidates the hwm entirely. That's especially true if there's an oom condition that kills a process when the rss grew to 100MB but you see a hwm of 2MB and don't believe it was possibly the culprit. The hwm is already defined as the highest rss the process has attained, resetting it and trying to make any inference from the result is racy and invalidates the actual value which is useful. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Thu, 22 Jan 2015, Primiano Tucci wrote: I think the bigger concern would be that this, and any new line such as resettable_hiwater_rss, invalidates itself entirely. Any process that checks the hwm will not know of other processes that reset it, so the value itself has no significance anymore. It would just be the mark since the last clear at an unknown time. How is that different from the current logic of clear_refs and the corresponding PG_Referenced bit? If you reset the hwm for a process, rss grows to 100MB, another process resets the hwm, and you see a hwm of 2MB, that invalidates the hwm entirely. That's especially true if there's an oom condition that kills a process when the rss grew to 100MB but you see a hwm of 2MB and don't believe it was possibly the culprit. The hwm is already defined as the highest rss the process has attained, resetting it and trying to make any inference from the result is racy and invalidates the actual value which is useful. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Thu, Jan 22, 2015 at 11:27 PM, David Rientjes rient...@google.com wrote: If you reset the hwm for a process, rss grows to 100MB, another process resets the hwm, and you see a hwm of 2MB, that invalidates the hwm entirely. Not sure I follow this scenario. Where does the 2MB come from? How can you see a hwm of 2MB, under which conditions? HVM can never be RSS. Again, what you are talking about is the case of two profilers racing for using the same interface (hwm). This is the same case today of the PG_referenced bit. The hwm is already defined as the highest rss the process has attained, resetting it and trying to make any inference from the result is racy and invalidates the actual value which is useful. The counter arugment is: once you have one very high peak, the hvm becomes essentially useless for the rest of the lifetime of the process (until a higher peak comes). This makes very hard to understand what is going on in the meanwhile (from userspace). Anyways, are you proposing to pursue a different approach? Is the approach 2. that petrcermark@ proposed in the beginning of the thread going to address this concern? -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, Jan 21, 2015 at 10:58 PM, David Rientjes wrote: > I think the bigger concern would be that this, and any new line such as > resettable_hiwater_rss, invalidates itself entirely. Any process that > checks the hwm will not know of other processes that reset it, so the > value itself has no significance anymore. > It would just be the mark since the last clear at an unknown time. How is that different from the current logic of clear_refs and the corresponding PG_Referenced bit? > Userspace can monitor the rss of a > process by reading /proc/pid/stat, there's no need for the kernel to do > something that userspace can do. I disagree here. The driving motivation of this patch is precisely the opposite. There are peak events that last for very short time (order: 10-100 ms) and are practically invisible from user-space (even doing something awkward like polling in a tight loop). Concrete examples are: GPU memory transfers, image decoding, compression / decompression. These kinds of tasks, which use scratch buffers for few ms, can create significant (yet short lasting) memory pressure which is desirable to monitor. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Thu, 15 Jan 2015, Kirill A. Shutemov wrote: > I'm not sure if it should be considered ABI break or not. Just asking. > > I would like to hear opinion from other people. > I think the bigger concern would be that this, and any new line such as resettable_hiwater_rss, invalidates itself entirely. Any process that checks the hwm will not know of other processes that reset it, so the value itself has no significance anymore. It would just be the mark since the last clear at an unknown time. Userspace can monitor the rss of a process by reading /proc/pid/stat, there's no need for the kernel to do something that userspace can do. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, Jan 21, 2015 at 10:58 PM, David Rientjes rient...@google.com wrote: I think the bigger concern would be that this, and any new line such as resettable_hiwater_rss, invalidates itself entirely. Any process that checks the hwm will not know of other processes that reset it, so the value itself has no significance anymore. It would just be the mark since the last clear at an unknown time. How is that different from the current logic of clear_refs and the corresponding PG_Referenced bit? Userspace can monitor the rss of a process by reading /proc/pid/stat, there's no need for the kernel to do something that userspace can do. I disagree here. The driving motivation of this patch is precisely the opposite. There are peak events that last for very short time (order: 10-100 ms) and are practically invisible from user-space (even doing something awkward like polling in a tight loop). Concrete examples are: GPU memory transfers, image decoding, compression / decompression. These kinds of tasks, which use scratch buffers for few ms, can create significant (yet short lasting) memory pressure which is desirable to monitor. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Thu, 15 Jan 2015, Kirill A. Shutemov wrote: I'm not sure if it should be considered ABI break or not. Just asking. I would like to hear opinion from other people. I think the bigger concern would be that this, and any new line such as resettable_hiwater_rss, invalidates itself entirely. Any process that checks the hwm will not know of other processes that reset it, so the value itself has no significance anymore. It would just be the mark since the last clear at an unknown time. Userspace can monitor the rss of a process by reading /proc/pid/stat, there's no need for the kernel to do something that userspace can do. -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Thu, Jan 15, 2015 at 01:36:30AM +0200, Kirill A. Shutemov wrote: > On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: > > On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: > > > And how it's not an ABI break? > > I don't think this is an ABI break because the current behaviour is not > > changed unless you write "5" to /proc/pid/clear_refs. If you do, you are > > explicitly requesting the new functionality. > > I'm not sure if it should be considered ABI break or not. Just asking. > > I would like to hear opinion from other people. We would like to get more feedback as well. > > > We have never-lowering VmHWM for 9+ years. How can you know that nobody > > > expects this behaviour? > > This is why we sent an RFC [1] several weeks ago. We expect this to be > > used mainly by performance-related tools (e.g. profilers) and from the > > comments in the code [2] VmHWM seems to be a best-effort counter. If this > > is strictly a no-go, I can only think of the following two alternatives: > > > > 1. Add an extra resettable field to /proc/pid/status (e.g. > > resettable_hiwater_rss). While this doesn't violate the current > > definition of VmHWM, it adds an extra line to /proc/pid/status, > > which I think is a much bigger issue. > > I don't think extra line is bigger issue. Sane applications would look for > a key, not line number. We do add lines there. I've posted patch which > adds one more just today ;) In that case, should we add an extra field to /proc/pid/status? > > 2. Introduce a new proc fs file to task_mmu (e.g. > > /proc/pid/profiler_stats), but this feels like overengineering. > > BTW, we have memory.max_usage_in_byte in memory cgroup. And it's resetable. > Wouldn't it be enough for your profiling use-case? This is a very interesting point, but it doesn't cover all use cases. Our specific use case is memory profiling of the Chromium browser, but I think that the same considerations below apply to any other use case which involves child sub-processes: 1. The process must be added to the control group explicitly by root. Hence, the Chromium process cannot do this itself. 2. All forked children are implicitly grouped in the same control group. This is a problem because we want to be able to measure memory usage of the child processes separately. The advantage of using clear_refs instead is that Chomium would be able to profile its memory usage without any external intervention (as it's already the case with performance profiling). > > > And why do you reset hiwater_rss, but not hiwater_vm? > > This is a good point. Should we reset both using the same flag, or > > introduce a new one ("6")? > > > > [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html > > [2] task_mmu.c:32: "... such snapshots can always be inconsistent." Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Thu, Jan 15, 2015 at 01:36:30AM +0200, Kirill A. Shutemov wrote: On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: And how it's not an ABI break? I don't think this is an ABI break because the current behaviour is not changed unless you write 5 to /proc/pid/clear_refs. If you do, you are explicitly requesting the new functionality. I'm not sure if it should be considered ABI break or not. Just asking. I would like to hear opinion from other people. We would like to get more feedback as well. We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? This is why we sent an RFC [1] several weeks ago. We expect this to be used mainly by performance-related tools (e.g. profilers) and from the comments in the code [2] VmHWM seems to be a best-effort counter. If this is strictly a no-go, I can only think of the following two alternatives: 1. Add an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss). While this doesn't violate the current definition of VmHWM, it adds an extra line to /proc/pid/status, which I think is a much bigger issue. I don't think extra line is bigger issue. Sane applications would look for a key, not line number. We do add lines there. I've posted patch which adds one more just today ;) In that case, should we add an extra field to /proc/pid/status? 2. Introduce a new proc fs file to task_mmu (e.g. /proc/pid/profiler_stats), but this feels like overengineering. BTW, we have memory.max_usage_in_byte in memory cgroup. And it's resetable. Wouldn't it be enough for your profiling use-case? This is a very interesting point, but it doesn't cover all use cases. Our specific use case is memory profiling of the Chromium browser, but I think that the same considerations below apply to any other use case which involves child sub-processes: 1. The process must be added to the control group explicitly by root. Hence, the Chromium process cannot do this itself. 2. All forked children are implicitly grouped in the same control group. This is a problem because we want to be able to measure memory usage of the child processes separately. The advantage of using clear_refs instead is that Chomium would be able to profile its memory usage without any external intervention (as it's already the case with performance profiling). And why do you reset hiwater_rss, but not hiwater_vm? This is a good point. Should we reset both using the same flag, or introduce a new one (6)? [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html [2] task_mmu.c:32: ... such snapshots can always be inconsistent. Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: > On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: > > And how it's not an ABI break? > I don't think this is an ABI break because the current behaviour is not > changed unless you write "5" to /proc/pid/clear_refs. If you do, you are > explicitly requesting the new functionality. > > > We have never-lowering VmHWM for 9+ years. How can you know that nobody > > expects this behaviour? > This is why we sent an RFC [1] several weeks ago. We expect this to be > used mainly by performance-related tools (e.g. profilers) and from the > comments in the code [2] VmHWM seems to be a best-effort counter. If this > is strictly a no-go, I can only think of the following two alternatives: > > 1. Add an extra resettable field to /proc/pid/status (e.g. > resettable_hiwater_rss). While this doesn't violate the current > definition of VmHWM, it adds an extra line to /proc/pid/status, > which I think is a much bigger issue. > 2. Introduce a new proc fs file to task_mmu (e.g. > /proc/pid/profiler_stats), but this feels like overengineering. BTW, we have memory.max_usage_in_byte in memory cgroup. And it's resetable. Wouldn't it be enough for your profiling use-case? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: > On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: > > And how it's not an ABI break? > I don't think this is an ABI break because the current behaviour is not > changed unless you write "5" to /proc/pid/clear_refs. If you do, you are > explicitly requesting the new functionality. I'm not sure if it should be considered ABI break or not. Just asking. I would like to hear opinion from other people. > > We have never-lowering VmHWM for 9+ years. How can you know that nobody > > expects this behaviour? > This is why we sent an RFC [1] several weeks ago. We expect this to be > used mainly by performance-related tools (e.g. profilers) and from the > comments in the code [2] VmHWM seems to be a best-effort counter. If this > is strictly a no-go, I can only think of the following two alternatives: > > 1. Add an extra resettable field to /proc/pid/status (e.g. > resettable_hiwater_rss). While this doesn't violate the current > definition of VmHWM, it adds an extra line to /proc/pid/status, > which I think is a much bigger issue. I don't think extra line is bigger issue. Sane applications would look for a key, not line number. We do add lines there. I've posted patch which adds one more just today ;) > 2. Introduce a new proc fs file to task_mmu (e.g. > /proc/pid/profiler_stats), but this feels like overengineering. > > > And why do you reset hiwater_rss, but not hiwater_vm? > This is a good point. Should we reset both using the same flag, or > introduce a new one ("6")? > > [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html > [2] task_mmu.c:32: "... such snapshots can always be inconsistent." -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: > And how it's not an ABI break? I don't think this is an ABI break because the current behaviour is not changed unless you write "5" to /proc/pid/clear_refs. If you do, you are explicitly requesting the new functionality. > We have never-lowering VmHWM for 9+ years. How can you know that nobody > expects this behaviour? This is why we sent an RFC [1] several weeks ago. We expect this to be used mainly by performance-related tools (e.g. profilers) and from the comments in the code [2] VmHWM seems to be a best-effort counter. If this is strictly a no-go, I can only think of the following two alternatives: 1. Add an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss). While this doesn't violate the current definition of VmHWM, it adds an extra line to /proc/pid/status, which I think is a much bigger issue. 2. Introduce a new proc fs file to task_mmu (e.g. /proc/pid/profiler_stats), but this feels like overengineering. > And why do you reset hiwater_rss, but not hiwater_vm? This is a good point. Should we reset both using the same flag, or introduce a new one ("6")? [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html [2] task_mmu.c:32: "... such snapshots can always be inconsistent." Petr -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: And how it's not an ABI break? I don't think this is an ABI break because the current behaviour is not changed unless you write 5 to /proc/pid/clear_refs. If you do, you are explicitly requesting the new functionality. I'm not sure if it should be considered ABI break or not. Just asking. I would like to hear opinion from other people. We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? This is why we sent an RFC [1] several weeks ago. We expect this to be used mainly by performance-related tools (e.g. profilers) and from the comments in the code [2] VmHWM seems to be a best-effort counter. If this is strictly a no-go, I can only think of the following two alternatives: 1. Add an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss). While this doesn't violate the current definition of VmHWM, it adds an extra line to /proc/pid/status, which I think is a much bigger issue. I don't think extra line is bigger issue. Sane applications would look for a key, not line number. We do add lines there. I've posted patch which adds one more just today ;) 2. Introduce a new proc fs file to task_mmu (e.g. /proc/pid/profiler_stats), but this feels like overengineering. And why do you reset hiwater_rss, but not hiwater_vm? This is a good point. Should we reset both using the same flag, or introduce a new one (6)? [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html [2] task_mmu.c:32: ... such snapshots can always be inconsistent. -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, Jan 14, 2015 at 03:22:25PM +, Petr Cermak wrote: On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: And how it's not an ABI break? I don't think this is an ABI break because the current behaviour is not changed unless you write 5 to /proc/pid/clear_refs. If you do, you are explicitly requesting the new functionality. We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? This is why we sent an RFC [1] several weeks ago. We expect this to be used mainly by performance-related tools (e.g. profilers) and from the comments in the code [2] VmHWM seems to be a best-effort counter. If this is strictly a no-go, I can only think of the following two alternatives: 1. Add an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss). While this doesn't violate the current definition of VmHWM, it adds an extra line to /proc/pid/status, which I think is a much bigger issue. 2. Introduce a new proc fs file to task_mmu (e.g. /proc/pid/profiler_stats), but this feels like overengineering. BTW, we have memory.max_usage_in_byte in memory cgroup. And it's resetable. Wouldn't it be enough for your profiling use-case? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, Jan 07, 2015 at 07:24:52PM +0200, Kirill A. Shutemov wrote: And how it's not an ABI break? I don't think this is an ABI break because the current behaviour is not changed unless you write 5 to /proc/pid/clear_refs. If you do, you are explicitly requesting the new functionality. We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? This is why we sent an RFC [1] several weeks ago. We expect this to be used mainly by performance-related tools (e.g. profilers) and from the comments in the code [2] VmHWM seems to be a best-effort counter. If this is strictly a no-go, I can only think of the following two alternatives: 1. Add an extra resettable field to /proc/pid/status (e.g. resettable_hiwater_rss). While this doesn't violate the current definition of VmHWM, it adds an extra line to /proc/pid/status, which I think is a much bigger issue. 2. Introduce a new proc fs file to task_mmu (e.g. /proc/pid/profiler_stats), but this feels like overengineering. And why do you reset hiwater_rss, but not hiwater_vm? This is a good point. Should we reset both using the same flag, or introduce a new one (6)? [1] lkml.iu.edu/hypermail/linux/kernel/1412.1/01877.html [2] task_mmu.c:32: ... such snapshots can always be inconsistent. Petr -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm->hiwater_rss (peak RSS)
On Wed, Jan 07, 2015 at 05:06:54PM +, Petr Cermak wrote: > Peak resident size of a process can be reset by writing "5" to > /proc/pid/clear_refs. The driving use-case for this would be getting the > peak RSS value, which can be retrieved from the VmHWM field in > /proc/pid/status, per benchmark iteration or test scenario. And how it's not an ABI break? We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? And why do you reset hiwater_rss, but not hiwater_vm? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH v2 2/2] task_mmu: Add user-space support for resetting mm-hiwater_rss (peak RSS)
On Wed, Jan 07, 2015 at 05:06:54PM +, Petr Cermak wrote: Peak resident size of a process can be reset by writing 5 to /proc/pid/clear_refs. The driving use-case for this would be getting the peak RSS value, which can be retrieved from the VmHWM field in /proc/pid/status, per benchmark iteration or test scenario. And how it's not an ABI break? We have never-lowering VmHWM for 9+ years. How can you know that nobody expects this behaviour? And why do you reset hiwater_rss, but not hiwater_vm? -- Kirill A. Shutemov -- To unsubscribe from this list: send the line unsubscribe linux-kernel in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/