Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Fri, 2014-06-27 at 12:08 +0300, Artem Bityutskiy wrote:
> To make 100% sure you'd not only need to drop VFS-level caches but also
> file-system-level caches. Indeed, file-systems have their own rather

Sorry, I wanted to say "rather complex" here

> buffers for different indexing data-structures, etc. The unmount/mount
> sequence takes care of that.

--
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Fri, 2014-06-27 at 10:41 +0200, Matthias Schniedermeyer wrote:
> On 26.06.2014 13:57, Lukáš Czerner wrote:
> > > So if the authors want to sell this new interface (in whatever form) to
> > > the kernel community, they should start with providing a solid use-case,
> > > with some more details, explore alternatives and show how the
> > > alternatives do not work for them.
> >
> > Yes please, let's see some solid use-case for this.
>
> Personally I would want it to verify files after copying them:
> Especially while moving files:
> - Copy a file
> - drop cache
> - Verify that it really is correct on stable storage
> - Remove original file

To make 100% sure you'd not only need to drop VFS-level caches but also
file-system-level caches. Indeed, file-systems have their own rather
buffers for different indexing data-structures, etc. The unmount/mount
sequence takes care of that.

--
Best Regards,
Artem Bityutskiy

--
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to majord...@vger.kernel.org
More majordomo info at http://vger.kernel.org/majordomo-info.html
Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Fri, 27 Jun 2014, Matthias Schniedermeyer wrote:

> Date: Fri, 27 Jun 2014 10:41:39 +0200
> From: Matthias Schniedermeyer
> To: Lukáš Czerner
> Cc: Artem Bityutskiy, Bernd Schubert, Dave Chinner, Thomas Knauth,
>     David Rientjes, Maksym Planeta, Alexander Viro,
>     linux-fsde...@vger.kernel.org, linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] sysctl: Add a feature to drop caches selectively
>
> On 26.06.2014 13:57, Lukáš Czerner wrote:
> > > So if the authors want to sell this new interface (in whatever form) to
> > > the kernel community, they should start with providing a solid use-case,
> > > with some more details, explore alternatives and show how the
> > > alternatives do not work for them.
> >
> > Yes please, let's see some solid use-case for this.
>
> Personally I would want it to verify files after copying them:
> Especially while moving files:
> - Copy a file
> - drop cache
> - Verify that it really is correct on stable storage
> - Remove original file

I assume you're using cp to copy a file, not your own program. In that
case can we make cp optionally use direct IO? It seems that it would
solve your problem in a very elegant way.

-Lukas

> Currently I choose one of these 3 ways:
> - drop_caches
> - umount/mount
> - Write more data than memory in machine (which is only an
>   approximation, and you have to verify in the same order the files were
>   written, so it is likely that any cache was thrashed in the meantime)
>
> But having a way to selectively "destroy" the cache of a file would make
> this task easier.
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On 06/27/2014 04:55 AM, Dave Chinner wrote:
> On Thu, Jun 26, 2014 at 02:10:28PM +0200, Bernd Schubert wrote:
>> On 06/26/2014 01:57 PM, Lukáš Czerner wrote:
>>> On Thu, 26 Jun 2014, Artem Bityutskiy wrote:
>>>> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>>>>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>>>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>>>>> Your particular use case can be handled by directing your benchmark
>>>>>>> at a filesystem mount point and unmounting the filesystem in between
>>>>>>> benchmark runs. There is no need to add kernel functionality for
>>>>>>> something that can be so easily achieved by other means, especially
>>>>>>> in benchmark environments where *everything* is tightly controlled.
>>>>>>
>>>>>> If I was a benchmark writer, I would not be willing to run it as root
>>>>>> to be able to mount/unmount, and I would not be willing to require the
>>>>>> customer to create special dedicated partitions for the benchmark,
>>>>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>>>>
>>>>> But why a sysctl then? And I also don't see a point for that at all, why
>>>>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>>>>
>>>> The latter question was answered - people want a way to drop caches for
>>>> a file. They need a method which guarantees that the caches are dropped.
>>>> They do not need an advisory method which does not give any guarantees.
>>
>> I'm not sure if a benchmark really needs that so much that
>> FADV_DONTNEED isn't sufficient.
>> Personally I would also like to know if FADV_DONTNEED succeeded.
>> I.e. 'ql-fstest' is to check if the written pattern went to the
>> block device and currently it does not know if data really had been
>> dropped from the page cache. As it reads files several times this is
>> not critical, but it would only be a nice-to-have - nothing worth
>> adding a new syscall for.
>
> ql-test is not a benchmark, it's a data integrity test. The re-read
> verification problem is easily solved by using direct IO to read the
> files directly without going through the page cache.
>
> Indeed, direct IO will invalidate cached pages over the range it
> reads before it does the read, so the guarantee that you are after -
> no cached pages when the read is done - is also fulfilled by the
> direct IO read...
>
> I really don't understand why people keep trying to make cached IO
> behave like uncached IO when we already have uncached IO
> interfaces

Firstly, direct IO has an entirely different IO pattern, usually much
simpler than buffered IO through the page cache. Secondly, going through
the page cache ensures that page cache buffering is also tested.
I'm not at all opposed to opening files randomly with direct IO to also
test that path, and I'm going to add that soon, but using only direct IO
would limit the use case of ql-fstest.

Bernd
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On 26.06.2014 13:57, Lukáš Czerner wrote:
> > So if the authors want to sell this new interface (in whatever form) to
> > the kernel community, they should start with providing a solid use-case,
> > with some more details, explore alternatives and show how the
> > alternatives do not work for them.
>
> Yes please, let's see some solid use-case for this.

Personally I would want it to verify files after copying them:
Especially while moving files:
- Copy a file
- drop cache
- Verify that it really is correct on stable storage
- Remove original file

Currently I choose one of these 3 ways:
- drop_caches
- umount/mount
- Write more data than memory in machine (which is only an
  approximation, and you have to verify in the same order the files were
  written, so it is likely that any cache was thrashed in the meantime)

But having a way to selectively "destroy" the cache of a file would make
this task easier.

--
Matthias
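[Editor's sketch] The copy, drop cache, verify, remove sequence described in this message could look roughly like the C code below. It is only an illustration under stated assumptions: verified_move, copy_fd and same_contents are hypothetical helper names, not anything from the patch under discussion. The cache-dropping step uses posix_fadvise(POSIX_FADV_DONTNEED), which is advisory, so the re-read is not guaranteed to come from stable storage. That missing guarantee is the gap this thread is about.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <unistd.h>

#define BUF_SZ 65536

/* Copy everything from `in` to `out`. Returns 0 on success, -1 on error. */
static int copy_fd(int in, int out)
{
    char buf[BUF_SZ];
    ssize_t n;

    while ((n = read(in, buf, sizeof buf)) > 0) {
        for (ssize_t off = 0; off < n; ) {
            ssize_t w = write(out, buf + off, (size_t)(n - off));
            if (w < 0)
                return -1;
            off += w;
        }
    }
    return n < 0 ? -1 : 0;
}

/* Compare two regular files from their current offsets.
 * Returns 1 if equal, 0 if different, -1 on read error.
 * Assumes reads on regular files are only short at end of file. */
static int same_contents(int a, int b)
{
    char x[BUF_SZ], y[BUF_SZ];

    for (;;) {
        ssize_t n = read(a, x, sizeof x);
        ssize_t m = read(b, y, sizeof y);
        if (n < 0 || m < 0)
            return -1;
        if (n != m || memcmp(x, y, (size_t)n) != 0)
            return 0;
        if (n == 0)
            return 1;
    }
}

/* Hypothetical helper: copy src to dst, flush, ask the kernel to drop the
 * cached pages of dst, re-read and compare, and remove src only on a match.
 * Returns 0 on success, -1 otherwise (dst is left in place for inspection). */
int verified_move(const char *src, const char *dst)
{
    assert(src != NULL && dst != NULL);

    int in = open(src, O_RDONLY);
    if (in < 0)
        return -1;
    int out = open(dst, O_RDWR | O_CREAT | O_EXCL, 0600);
    if (out < 0) {
        close(in);
        return -1;
    }

    int ok = copy_fd(in, out) == 0
          && fsync(out) == 0                       /* dirty pages cannot be dropped */
          && posix_fadvise(out, 0, 0, POSIX_FADV_DONTNEED) == 0 /* advisory only */
          && lseek(in, 0, SEEK_SET) == 0
          && lseek(out, 0, SEEK_SET) == 0
          && same_contents(in, out) == 1;

    close(in);
    if (close(out) != 0)
        ok = 0;
    return ok ? unlink(src) : -1;
}
```

A caller would use it as verified_move("a.dat", "/mnt/backup/a.dat"); both paths here are placeholders.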
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, Jun 26, 2014 at 02:10:28PM +0200, Bernd Schubert wrote:
> On 06/26/2014 01:57 PM, Lukáš Czerner wrote:
>> On Thu, 26 Jun 2014, Artem Bityutskiy wrote:
>>> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>>>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>>>> Your particular use case can be handled by directing your benchmark
>>>>>> at a filesystem mount point and unmounting the filesystem in between
>>>>>> benchmark runs. There is no need to add kernel functionality for
>>>>>> something that can be so easily achieved by other means, especially
>>>>>> in benchmark environments where *everything* is tightly controlled.
>>>>>
>>>>> If I was a benchmark writer, I would not be willing to run it as root
>>>>> to be able to mount/unmount, and I would not be willing to require the
>>>>> customer to create special dedicated partitions for the benchmark,
>>>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>>>
>>>> But why a sysctl then? And I also don't see a point for that at all, why
>>>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>>>
>>> The latter question was answered - people want a way to drop caches for
>>> a file. They need a method which guarantees that the caches are dropped.
>>> They do not need an advisory method which does not give any guarantees.
>
> I'm not sure if a benchmark really needs that so much that
> FADV_DONTNEED isn't sufficient.
> Personally I would also like to know if FADV_DONTNEED succeeded.
> I.e. 'ql-fstest' is to check if the written pattern went to the
> block device and currently it does not know if data really had been
> dropped from the page cache. As it reads files several times this is
> not critical, but it would only be a nice-to-have - nothing worth
> adding a new syscall for.

ql-test is not a benchmark, it's a data integrity test. The re-read
verification problem is easily solved by using direct IO to read the
files directly without going through the page cache.

Indeed, direct IO will invalidate cached pages over the range it
reads before it does the read, so the guarantee that you are after -
no cached pages when the read is done - is also fulfilled by the
direct IO read...

I really don't understand why people keep trying to make cached IO
behave like uncached IO when we already have uncached IO
interfaces

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
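[Editor's sketch] A re-read through direct IO, as suggested in this message, might look like the C code below. It is a minimal sketch, not code from ql-fstest. The function name checksum_direct, the 4096-byte alignment, the 64 KiB chunk size and the choice of an FNV-1a checksum are all assumptions made for illustration. The real alignment requirement depends on the filesystem and device, and some filesystems reject O_DIRECT at open() with EINVAL.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes O_DIRECT on Linux */
#endif
#include <assert.h>
#include <errno.h>
#include <fcntl.h>
#include <stdint.h>
#include <stdlib.h>
#include <unistd.h>

#define DIO_ALIGN 4096           /* assumed buffer/offset alignment */
#define DIO_CHUNK (64 * 1024)    /* assumed request size, multiple of DIO_ALIGN */

/* Hypothetical helper: read `path` with O_DIRECT, bypassing the page cache,
 * and compute a 64-bit FNV-1a checksum of its contents.
 * Returns 0 and stores the checksum in *out, or -1 with errno set. */
int checksum_direct(const char *path, uint64_t *out)
{
    assert(path != NULL && out != NULL);

    int fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0)
        return -1;

    void *buf = NULL;
    int err = posix_memalign(&buf, DIO_ALIGN, DIO_CHUNK);
    if (err != 0) {
        close(fd);
        errno = err;
        return -1;
    }

    uint64_t h = 0xcbf29ce484222325ULL;          /* FNV-1a offset basis */
    ssize_t n;
    for (;;) {
        n = read(fd, buf, DIO_CHUNK);
        if (n <= 0)
            break;
        const unsigned char *p = buf;
        for (ssize_t i = 0; i < n; i++) {
            h ^= p[i];
            h *= 0x100000001b3ULL;               /* FNV-1a prime */
        }
        if (n < DIO_CHUNK) {                     /* short read: end of file */
            n = 0;
            break;
        }
    }

    int saved = errno;
    free(buf);
    close(fd);
    if (n < 0) {
        errno = saved;
        return -1;
    }
    *out = h;
    return 0;
}
```

Comparing this checksum with one computed from the data that was written gives a verification step that does not rely on the page cache having been dropped first.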
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, Jun 26, 2014 at 09:13:19AM +0300, Artem Bityutskiy wrote:
> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
> > Your particular use case can be handled by directing your benchmark
> > at a filesystem mount point and unmounting the filesystem in between
> > benchmark runs. There is no need to add kernel functionality for
> > something that can be so easily achieved by other means, especially
> > in benchmark environments where *everything* is tightly controlled.
>
> If I was a benchmark writer, I would not be willing to run it as root
> to be able to mount/unmount, and I would not be willing to require the
> customer to create special dedicated partitions for the benchmark,
> because this is too user-unfriendly. Or do I make incorrect assumptions?

Just add the dev/mntpt to /etc/fstab and add "user" to the configuration
and the need for root goes away.

Cheers,

Dave.
--
Dave Chinner
da...@fromorbit.com
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On 06/26/2014 01:57 PM, Lukáš Czerner wrote:
> On Thu, 26 Jun 2014, Artem Bityutskiy wrote:
>> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>>> Your particular use case can be handled by directing your benchmark
>>>>> at a filesystem mount point and unmounting the filesystem in between
>>>>> benchmark runs. There is no need to add kernel functionality for
>>>>> something that can be so easily achieved by other means, especially
>>>>> in benchmark environments where *everything* is tightly controlled.
>>>>
>>>> If I was a benchmark writer, I would not be willing to run it as root
>>>> to be able to mount/unmount, and I would not be willing to require the
>>>> customer to create special dedicated partitions for the benchmark,
>>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>>
>>> But why a sysctl then? And I also don't see a point for that at all, why
>>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>>
>> The latter question was answered - people want a way to drop caches for
>> a file. They need a method which guarantees that the caches are dropped.
>> They do not need an advisory method which does not give any guarantees.

I'm not sure if a benchmark really needs that so much that FADV_DONTNEED
isn't sufficient.
Personally I would also like to know if FADV_DONTNEED succeeded. I.e.
'ql-fstest' is to check if the written pattern went to the block device
and currently it does not know if data really had been dropped from the
page cache. As it reads files several times this is not critical, but it
would only be a nice-to-have - nothing worth adding a new syscall for.

Cheers,
Bernd
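[Editor's sketch] One way a test tool could check whether FADV_DONTNEED actually took effect is to ask the kernel which pages of the file are resident, using mmap() plus mincore(). The C sketch below assumes this approach. The function names drop_file_cache and resident_pages are hypothetical and nothing here is taken from ql-fstest. The result is only a snapshot that can change immediately afterwards, and recent kernels may restrict what mincore() reports for files the caller cannot write.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE              /* exposes mincore() and posix_fadvise() */
#endif
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/* Hypothetical helper: flush fd and ask the kernel to drop its cached pages.
 * The request is advisory. Returns 0 if both calls succeeded, -1 otherwise. */
int drop_file_cache(int fd)
{
    assert(fd >= 0);
    if (fsync(fd) != 0)          /* dirty pages are not dropped, so flush first */
        return -1;
    /* offset 0 with len 0 means "the whole file"; posix_fadvise returns an
     * error number directly instead of setting errno. */
    return posix_fadvise(fd, 0, 0, POSIX_FADV_DONTNEED) == 0 ? 0 : -1;
}

/* Hypothetical helper: count how many pages of fd are resident in memory
 * right now. fd must be readable. Returns the count, or -1 on error. */
long resident_pages(int fd)
{
    struct stat st;
    if (fstat(fd, &st) != 0)
        return -1;
    if (st.st_size == 0)
        return 0;

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + page - 1) / page;

    /* Mapping the file does not fault pages in, so it does not disturb
     * the state being measured. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    if (map == MAP_FAILED)
        return -1;

    long count = -1;
    unsigned char *vec = malloc(npages);
    if (vec != NULL && mincore(map, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1;     /* low bit set means "resident" */
    }
    free(vec);
    munmap(map, len);
    return count;
}
```

Calling resident_pages() after drop_file_cache() and getting a non-zero count would tell the tool that the following read may still be served from memory.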
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, 26 Jun 2014, Artem Bityutskiy wrote:

> Date: Thu, 26 Jun 2014 14:31:03 +0300
> From: Artem Bityutskiy
> To: Bernd Schubert
> Cc: Dave Chinner, Thomas Knauth, David Rientjes, Maksym Planeta,
>     Alexander Viro, linux-fsde...@vger.kernel.org,
>     linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] sysctl: Add a feature to drop caches selectively
>
> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
> > On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
> > > On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
> > >> Your particular use case can be handled by directing your benchmark
> > >> at a filesystem mount point and unmounting the filesystem in between
> > >> benchmark runs. There is no need to add kernel functionality for
> > >> something that can be so easily achieved by other means, especially
> > >> in benchmark environments where *everything* is tightly controlled.
> > >
> > > If I was a benchmark writer, I would not be willing to run it as root
> > > to be able to mount/unmount, and I would not be willing to require the
> > > customer to create special dedicated partitions for the benchmark,
> > > because this is too user-unfriendly. Or do I make incorrect assumptions?
> >
> > But why a sysctl then? And I also don't see a point for that at all, why
> > can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>
> The latter question was answered - people want a way to drop caches for
> a file. They need a method which guarantees that the caches are dropped.
> They do not need an advisory method which does not give any guarantees.
>
> As for the first question - this is what I was asking too, but
> without suggesting alternatives. I challenged the authors with the
> following:
>
> 1. Why would the interface only allow the super user to drop the
> caches? How about allowing the file owner or, generally speaking, the
> person who is allowed to modify the file, to drop the caches?
>
> I alluded that this may be doable with an fd-based interface.
>
> 2. What about symlinks? Can I have a choice whether I drop caches
> (struct inode, I suppose) for the symlink itself or for the destination
> file? Again, an fd-based interface would probably naturally allow for this.
>
> 3. What about leaving some room for future extensions? E.g., someone may
> want to drop only part of a file in the future, who knows. Can we invent
> an interface which could be extended in the future, without
> breaking older software?
>
> My intention was to encourage the submitter to take some time and come
> back with deeper analysis.
>
> And finally, and most importantly, Dave stated that any per-file cache
> dropping interface is unlikely to be accepted at all, because
> there is mount/unmount.
>
> So far this is the main concern the submitter should address.
>
> But I just answered that what Dave suggested is probably not the nicest
> way to do this from the user-space perspective, because it requires
> superuser privileges, and probably a separate "benchmark-only"
> partition.

I think that Dave is right in that if it's just for "benchmarking"
purposes, then there is no need for a new special interface for
dropping caches. There is mount/umount and drop_caches, which should
be more than enough for any benchmark.

And while it's true that you'd likely need superuser privileges for
mount/umount, the same is true about drop_caches, isn't it?

> So if the authors want to sell this new interface (in whatever form) to
> the kernel community, they should start with providing a solid use-case,
> with some more details, explore alternatives and show how the
> alternatives do not work for them.

Yes please, let's see some solid use-case for this.

-Lukas
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
> > On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
> >> Your particular use case can be handled by directing your benchmark
> >> at a filesystem mount point and unmounting the filesystem in between
> >> benchmark runs. There is no need to add kernel functionality for
> >> something that can be so easily achieved by other means, especially
> >> in benchmark environments where *everything* is tightly controlled.
> >
> > If I was a benchmark writer, I would not be willing to run it as root
> > to be able to mount/unmount, and I would not be willing to require the
> > customer to create special dedicated partitions for the benchmark,
> > because this is too user-unfriendly. Or do I make incorrect assumptions?
>
> But why a sysctl then? And I also don't see a point for that at all, why
> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?

The latter question was answered - people want a way to drop caches for
a file. They need a method which guarantees that the caches are dropped.
They do not need an advisory method which does not give any guarantees.

As for the first question - this is what I was asking too, but
without suggesting alternatives. I challenged the authors with the
following:

1. Why would the interface only allow the super user to drop the
caches? How about allowing the file owner or, generally speaking, the
person who is allowed to modify the file, to drop the caches?

I alluded that this may be doable with an fd-based interface.

2. What about symlinks? Can I have a choice whether I drop caches
(struct inode, I suppose) for the symlink itself or for the destination
file? Again, an fd-based interface would probably naturally allow for this.

3. What about leaving some room for future extensions? E.g., someone may
want to drop only part of a file in the future, who knows. Can we invent
an interface which could be extended in the future, without
breaking older software?

My intention was to encourage the submitter to take some time and come
back with deeper analysis.

And finally, and most importantly, Dave stated that any per-file cache
dropping interface is unlikely to be accepted at all, because
there is mount/unmount.

So far this is the main concern the submitter should address.

But I just answered that what Dave suggested is probably not the nicest
way to do this from the user-space perspective, because it requires
superuser privileges, and probably a separate "benchmark-only"
partition.

So if the authors want to sell this new interface (in whatever form) to
the kernel community, they should start with providing a solid use-case,
with some more details, explore alternatives and show how the
alternatives do not work for them.

--
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>> Your particular use case can be handled by directing your benchmark
>> at a filesystem mount point and unmounting the filesystem in between
>> benchmark runs. There is no need to add kernel functionality for
>> something that can be so easily achieved by other means, especially
>> in benchmark environments where *everything* is tightly controlled.
>
> If I was a benchmark writer, I would not be willing to run it as root
> to be able to mount/unmount, and I would not be willing to require the
> customer to create special dedicated partitions for the benchmark,
> because this is too user-unfriendly. Or do I make incorrect assumptions?

But why a sysctl then? And I also don't see a point for that at all, why
can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?

Cheers,
Bernd
Re: [PATCH] sysctl: Add a feature to drop caches selectively
> With a binary interface like an ioctl I can see how you could have extra
> unused fields which you can ignore now and let people start adding extra
> options like the range in the future.

Yes, ioctl is another possibility. But I would argue that sysctl is a
more convenient interface, because the idea of sdrop_caches is similar to
that of drop_caches, and it is convenient to have these interfaces in
the same place. But if sdrop_caches uses procfs, it seems that there is
no easy way to pass parameters of different types in one write
operation.

> Other questions I'd ask would be - how about the access control model?
> Will only root be able to drop caches? Why can't I drop caches for my
> own file?

The access control model is the same as for drop_caches. This means that
only root can write to this file. But it is easy to add a feature that
allows any user to clean the page cache of inodes that this user owns.

2014-06-25 15:42 GMT+02:00 Artem Bityutskiy:
> On Wed, 2014-06-25 at 15:23 +0200, Thomas Knauth wrote:
>> On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy wrote:
>> > Thanks for the answer, although you forgot to comment on the question
>> > about possibly extending the new interface to work with file ranges in
>> > the future. For example, I have a 2 TiB file, and I am only interested
>> > in dropping caches for the first couple of gigabytes. Would I extend
>> > your interface, or would I come up with another one?
>>
>> Ah, didn't quite understand what was meant with file ranges. Again, we
>> had not considered this so far. I guess you could make a distinction
>> between directories and files here. If the path points to a file, you
>> can have an optional argument indicating the range of bytes you would
>> like to drop. Something like
>>
>> echo "my-file 0-1000,8000-1000" > /proc/sys/vm/sdrop_cache
>>
>> If this is desirable, we can add it to the patch.
>
> With a binary interface like an ioctl I can see how you could have extra
> unused fields which you can ignore now and let people start adding extra
> options like the range in the future.
>
> With this kind of interface I am not sure how to do this.
>
> Other questions I'd ask would be - how about the access control model?
> Will only root be able to drop caches? Why can't I drop caches for my
> own file?
>
> I did not put much thinking into this, but it looks like ioctl could be
> a better interface for the task you are trying to solve...
>
> Sorry if I am a bit vague, I am mostly trying to make you guys give this
> more thought, and come up with a deeper analysis. Interfaces are very
> important to get right, or as right as possible...
>
> --
> Best Regards,
> Artem Bityutskiy

--
Regards,
Maksym Planeta.
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
> Your particular use case can be handled by directing your benchmark
> at a filesystem mount point and unmounting the filesystem in between
> benchmark runs. There is no need to add kernel functionality for
> something that can be so easily achieved by other means, especially
> in benchmark environments where *everything* is tightly controlled.

If I was a benchmark writer, I would not be willing to run it as root to be able to mount/unmount, and I would not be willing to require the customer to create special dedicated partitions for the benchmark, because this is too user-unfriendly. Or do I make incorrect assumptions?

Not that I need this syscall or am trying to sell the idea to anyone; I am just trying to understand the alternative you suggested.

--
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>> Your particular use case can be handled by directing your benchmark
>>> at a filesystem mount point and unmounting the filesystem in between
>>> benchmark runs. There is no need to add kernel functionality for
>>> something that can be so easily achieved by other means, especially
>>> in benchmark environments where *everything* is tightly controlled.
>>
>> If I was a benchmark writer, I would not be willing to run it as root
>> to be able to mount/unmount, and I would not be willing to require the
>> customer to create special dedicated partitions for the benchmark,
>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>
> But why a sysctl then? I also don't see the point of that at all: why
> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?

The latter question was answered - people want a way to drop caches for a file. They need a method which guarantees that the caches are dropped. They do not need an advisory method which does not give any guarantees.

As for the first question - this is what I was asking too, but without suggesting alternatives. I challenged the authors with the following:

1. Why would the interface only allow the superuser to drop the caches? How about allowing the file owner or, generally speaking, anyone who is allowed to modify the file, to drop the caches? I hinted that this may be doable with an fd-based interface.

2. What about symlinks? Can I choose whether I drop caches (struct inode, I suppose) for the symlink itself or for the destination file? Again, an fd-based interface would probably allow for this naturally.

3. What about leaving some room for future extensions? E.g., someone may want to drop only part of a file in the future, who knows. Can we invent an interface which can be extended in the future without breaking older software?

My intention was to encourage the submitter to take some time and come back with a deeper analysis.

And finally, and most importantly, Dave stated that any per-file cache dropping interface is unlikely to be accepted at all, because there is mount/unmount. So far this is the main concern the submitter should address.

But I just answered that what Dave suggested is probably not the nicest way to do this from the user-space perspective, because it requires superuser privileges, and probably a separate benchmark-only partition.

So if the authors want to sell this new interface (in whatever form) to the kernel community, they should start by providing a solid use-case, with some more details, explore alternatives and show how the alternatives do not work for them.

--
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, 26 Jun 2014, Artem Bityutskiy wrote:

> Date: Thu, 26 Jun 2014 14:31:03 +0300
> From: Artem Bityutskiy dedeki...@gmail.com
> To: Bernd Schubert bernd.schub...@itwm.fraunhofer.de
> Cc: Dave Chinner da...@fromorbit.com, Thomas Knauth thomas.kna...@gmx.de,
>     David Rientjes rient...@google.com, Maksym Planeta mcsim.plan...@gmail.com,
>     Alexander Viro v...@zeniv.linux.org.uk, linux-fsde...@vger.kernel.org,
>     linux-kernel@vger.kernel.org
> Subject: Re: [PATCH] sysctl: Add a feature to drop caches selectively
>
> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>> Your particular use case can be handled by directing your benchmark
>>>> at a filesystem mount point and unmounting the filesystem in between
>>>> benchmark runs. There is no need to add kernel functionality for
>>>> something that can be so easily achieved by other means, especially
>>>> in benchmark environments where *everything* is tightly controlled.
>>>
>>> If I was a benchmark writer, I would not be willing to run it as root
>>> to be able to mount/unmount, and I would not be willing to require the
>>> customer to create special dedicated partitions for the benchmark,
>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>
>> But why a sysctl then? I also don't see the point of that at all: why
>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>
> The latter question was answered - people want a way to drop caches for
> a file. They need a method which guarantees that the caches are dropped.
> They do not need an advisory method which does not give any guarantees.
>
> As for the first question - this is what I was asking too, but without
> suggesting alternatives. I challenged the authors with the following:
>
> 1. Why would the interface only allow the superuser to drop the caches?
>    How about allowing the file owner or, generally speaking, anyone who
>    is allowed to modify the file, to drop the caches? I hinted that this
>    may be doable with an fd-based interface.
>
> 2. What about symlinks? Can I choose whether I drop caches (struct
>    inode, I suppose) for the symlink itself or for the destination file?
>    Again, an fd-based interface would probably allow for this naturally.
>
> 3. What about leaving some room for future extensions? E.g., someone may
>    want to drop only part of a file in the future, who knows. Can we
>    invent an interface which can be extended in the future without
>    breaking older software?
>
> My intention was to encourage the submitter to take some time and come
> back with a deeper analysis.
>
> And finally, and most importantly, Dave stated that any per-file cache
> dropping interface is unlikely to be accepted at all, because there is
> mount/unmount. So far this is the main concern the submitter should
> address.
>
> But I just answered that what Dave suggested is probably not the nicest
> way to do this from the user-space perspective, because it requires
> superuser privileges, and probably a separate benchmark-only partition.

I think that Dave is right in that if it's just for benchmarking purposes, then there is no need for a new special interface for dropping caches. There is mount/umount and drop_caches, which should be more than enough for any benchmark. And while it's true that you'd likely need superuser privileges for mount/umount, the same is true of drop_caches, isn't it?

> So if the authors want to sell this new interface (in whatever form) to
> the kernel community, they should start by providing a solid use-case,
> with some more details, explore alternatives and show how the
> alternatives do not work for them.

Yes please, let's see some solid use-case for this.

-Lukas
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On 06/26/2014 01:57 PM, Lukáš Czerner wrote:
> On Thu, 26 Jun 2014, Artem Bityutskiy wrote:
>> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>>> Your particular use case can be handled by directing your benchmark
>>>>> at a filesystem mount point and unmounting the filesystem in between
>>>>> benchmark runs. There is no need to add kernel functionality for
>>>>> something that can be so easily achieved by other means, especially
>>>>> in benchmark environments where *everything* is tightly controlled.
>>>>
>>>> If I was a benchmark writer, I would not be willing to run it as root
>>>> to be able to mount/unmount, and I would not be willing to require the
>>>> customer to create special dedicated partitions for the benchmark,
>>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>>
>>> But why a sysctl then? I also don't see the point of that at all: why
>>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>>
>> The latter question was answered - people want a way to drop caches for
>> a file. They need a method which guarantees that the caches are dropped.
>> They do not need an advisory method which does not give any guarantees.

I'm not sure a benchmark really needs that so badly that FADV_DONTNEED isn't sufficient. Personally I would also like to know whether FADV_DONTNEED succeeded. For example, 'ql-fstest' checks whether the written pattern went to the block device, and currently it does not know whether the data really has been dropped from the page cache. As it reads files several times this is not critical; it would only be nice to have - nothing worth adding a new syscall for.

Cheers,
Bernd
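For readers wondering how user space could check today whether pages of a file are still cached, here is a minimal sketch using mmap() and mincore(), which are Linux/BSD calls rather than anything proposed in this thread. The function name resident_pages is made up for illustration, and the result is only a snapshot that can change right after the call.

```c
#define _DEFAULT_SOURCE /* mincore() is not part of POSIX */
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdlib.h>
#include <sys/mman.h>
#include <sys/stat.h>
#include <unistd.h>

/*
 * Hypothetical helper: count how many pages of the file at path are
 * currently resident in the page cache. Returns the count, or -1 on error.
 */
long resident_pages(const char *path)
{
    struct stat st;
    long count = -1;
    int fd;

    assert(path != NULL);
    fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;
    if (fstat(fd, &st) != 0) {
        close(fd);
        return -1;
    }
    if (st.st_size == 0) {
        close(fd);
        return 0;
    }

    size_t page = (size_t)sysconf(_SC_PAGESIZE);
    size_t len = (size_t)st.st_size;
    size_t npages = (len + page - 1) / page;

    /* Mapping the file does not by itself read its pages in. */
    void *map = mmap(NULL, len, PROT_READ, MAP_SHARED, fd, 0);
    close(fd);
    if (map == MAP_FAILED)
        return -1;

    unsigned char *vec = malloc(npages);
    if (vec != NULL && mincore(map, len, vec) == 0) {
        count = 0;
        for (size_t i = 0; i < npages; i++)
            count += vec[i] & 1; /* bit 0 set: page is resident */
    }
    free(vec);
    munmap(map, len);
    return count;
}
```

A test tool could call this after posix_fadvise(POSIX_FADV_DONTNEED) and treat a non-zero result as "not everything was dropped".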
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, Jun 26, 2014 at 09:13:19AM +0300, Artem Bityutskiy wrote:
> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>> Your particular use case can be handled by directing your benchmark
>> at a filesystem mount point and unmounting the filesystem in between
>> benchmark runs. There is no need to add kernel functionality for
>> something that can be so easily achieved by other means, especially
>> in benchmark environments where *everything* is tightly controlled.
>
> If I was a benchmark writer, I would not be willing to run it as root
> to be able to mount/unmount, and I would not be willing to require the
> customer to create special dedicated partitions for the benchmark,
> because this is too user-unfriendly. Or do I make incorrect assumptions?

Just add the dev/mntpt to /etc/fstab and add the "user" option to its entry, and the need for root goes away.

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Thu, Jun 26, 2014 at 02:10:28PM +0200, Bernd Schubert wrote:
> On 06/26/2014 01:57 PM, Lukáš Czerner wrote:
>> On Thu, 26 Jun 2014, Artem Bityutskiy wrote:
>>> On Thu, 2014-06-26 at 12:36 +0200, Bernd Schubert wrote:
>>>> On 06/26/2014 08:13 AM, Artem Bityutskiy wrote:
>>>>> On Thu, 2014-06-26 at 11:06 +1000, Dave Chinner wrote:
>>>>>> Your particular use case can be handled by directing your benchmark
>>>>>> at a filesystem mount point and unmounting the filesystem in between
>>>>>> benchmark runs. There is no need to add kernel functionality for
>>>>>> something that can be so easily achieved by other means, especially
>>>>>> in benchmark environments where *everything* is tightly controlled.
>>>>>
>>>>> If I was a benchmark writer, I would not be willing to run it as root
>>>>> to be able to mount/unmount, and I would not be willing to require the
>>>>> customer to create special dedicated partitions for the benchmark,
>>>>> because this is too user-unfriendly. Or do I make incorrect assumptions?
>>>>
>>>> But why a sysctl then? I also don't see the point of that at all: why
>>>> can't the benchmark use posix_fadvise(POSIX_FADV_DONTNEED)?
>>>
>>> The latter question was answered - people want a way to drop caches for
>>> a file. They need a method which guarantees that the caches are dropped.
>>> They do not need an advisory method which does not give any guarantees.
>
> I'm not sure a benchmark really needs that so badly that FADV_DONTNEED
> isn't sufficient. Personally I would also like to know whether
> FADV_DONTNEED succeeded. For example, 'ql-fstest' checks whether the
> written pattern went to the block device, and currently it does not know
> whether the data really has been dropped from the page cache. As it
> reads files several times this is not critical; it would only be nice to
> have - nothing worth adding a new syscall for.

ql-fstest is not a benchmark, it's a data integrity test. The re-read verification problem is easily solved by using direct IO to read the files directly without going through the page cache. Indeed, direct IO will invalidate cached pages over the range it reads before it does the read, so the guarantee that you are after - no cached pages when the read is done - is also fulfilled by the direct IO read...

I really don't understand why people keep trying to make cached IO behave like uncached IO when we already have uncached IO interfaces.

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
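To make the direct IO suggestion concrete, here is a rough sketch of re-reading the start of a file with O_DIRECT on Linux. The helper name read_direct and the 4096-byte alignment are assumptions for illustration; the real alignment requirement depends on the device and filesystem, and some filesystems reject O_DIRECT altogether, in which case the helper simply reports an error.

```c
#define _GNU_SOURCE /* O_DIRECT is a Linux extension */
#include <assert.h>
#include <fcntl.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>

#define DIO_ALIGN 4096 /* assumed alignment; not guaranteed for every device */

/*
 * Hypothetical helper: read up to len bytes from the start of path while
 * bypassing the page cache. Returns the number of bytes copied into out,
 * or -1 on error (including filesystems that reject O_DIRECT).
 * A single read() is issued, so a short read is possible.
 */
ssize_t read_direct(const char *path, void *out, size_t len)
{
    void *buf;
    ssize_t n;
    int fd;

    assert(path != NULL && out != NULL);
    if (len == 0)
        return 0;

    /* Buffer address and request size must both be aligned for O_DIRECT. */
    size_t want = (len + DIO_ALIGN - 1) / DIO_ALIGN * DIO_ALIGN;
    if (posix_memalign(&buf, DIO_ALIGN, want) != 0)
        return -1;

    fd = open(path, O_RDONLY | O_DIRECT);
    if (fd < 0) {
        free(buf);
        return -1;
    }

    n = read(fd, buf, want);
    close(fd);
    if (n > 0) {
        if ((size_t)n > len)
            n = (ssize_t)len;
        memcpy(out, buf, (size_t)n);
    }
    free(buf);
    return n;
}
```

A verification tool would compare the returned bytes against the pattern it wrote earlier, after an fsync() of the written file.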
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 10:25:05AM +0200, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy wrote:
> > Plus some explanations WRT why proc-based interface and what would be
> > the alternatives, what if tomorrow we want to extend the functionality
> > and drop caches only for certain file range, is this only for regular
> > files or also for directories, why posix_fadvise(DONTNEED) is not
> > sufficient.
>
> I suggested the idea originally. Let me address each of your questions in
> turn:
>
> Why a selective drop? To have a middle ground between echo 2 >
> drop_caches and echo 3 > drop_caches. When is this interesting? My
> particular use case was benchmarking. I wanted to repeatedly measure
> the timing when things were read from disk. Dropping everything from
> the cache also drops useful things, not just the few files your
> benchmark intends to measure.

We're not likely to ever extend the drop_caches functionality. This is brought up semi-regularly by people that have some slightly narrower use-case for dropping caches.

Your particular use case can be handled by directing your benchmark at a filesystem mount point and unmounting the filesystem in between benchmark runs. There is no need to add kernel functionality for something that can be so easily achieved by other means, especially in benchmark environments where *everything* is tightly controlled.

Cheers,
Dave.
--
Dave Chinner da...@fromorbit.com
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed 2014-06-25 10:25:05, Thomas Knauth wrote: > On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy wrote: > > Plus some explanations WRT why proc-based interface and what would be > > the alternatives, what if tomorrow we want to extend the functionality > > and drop caches only for certain file range, is this only for regular > > files or also for directories, why posix_fadvice(DONTNEED) is not > > sufficient. > > I suggested the idea originally. Let me address each of your questions in > turn: > > Why a selective drop? To have a middle ground between echo 2 > > drop_caches and echo 3 > drop_caches. When is this interesting? My > particular use case was benchmarking. I wanted to repeatedly measure > the timing when things were read from disk. Dropping everything from > the cache, also drops useful things, not just the few files your > benchmark intends to measure. > > Why /proc? Because this is where the current drop_caches mechanism is > located. If it should go somewhere else, please do suggest so. It sounds like this should be a new syscall. echoing filenames in files is strange/ugly. Pavel -- (english) http://www.livejournal.com/~pavelmachek (cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 15:23 +0200, Thomas Knauth wrote: > On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy > wrote: > > Thanks for the answer, although you forgot to comment on the question > > about possibly extending the new interface to work with file ranges in > > the future. For example, I have a 2 TiB file, and I am only interested > > in dropping caches for the first couple of gigabytes. Would I extend > > your interface, or would I come up with another one? > > Ah, didn't quite understand what was meant with file ranges. Again, we > had not considered this so far. I guess you could make a distinction > between directories and files here. If the path points to a file, you > can have an optional argument indicating the range of bytes you would > like to drop. Something like > > echo "my-file 0-1000,8000-1000" > /proc/sys/vm/sdrop_cache > > If this is desirable, we can add it to the patch. With a binary interface like an ioctl I can see how you could have extra unused fields which you can ignore now and let people start adding extra options like the range in the future. With this kind of interface I am not sure how to do this. Other questions I'd ask would be - how about the access control model? Will only root be able to drop caches? Why can't I drop caches for my own file? I did not put much thinking into this, but it looks like ioctl could be a better interface for the task you are trying to solve... Sorry if I am a bit vague, I am mostly trying to make you guys give this more thoughts, and come up with a deeper analysis. Interfaces are very important to get right, or as right as possible... -- Best Regards, Artem Bityutskiy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
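The point about "extra unused fields" can be sketched in plain C. Everything below is hypothetical: the struct, the flag, and the validation function are invented for illustration and are not taken from the patch or from any existing kernel interface. The idea is that reserved fields and unknown flag bits must be zero today, so that a later version can give them meaning without breaking old callers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical request layout for an ioctl-style "drop cache" call. */
struct drop_cache_req {
    uint32_t size;        /* sizeof(struct drop_cache_req) as seen by the caller */
    uint32_t flags;       /* must be a subset of DROP_KNOWN_FLAGS */
    uint64_t offset;      /* start of the byte range (only with DROP_FLAG_RANGE) */
    uint64_t length;      /* length of the byte range, 0 = to end of file */
    uint64_t reserved[4]; /* room for future extensions, must be zero for now */
};

#define DROP_FLAG_RANGE  0x1u
#define DROP_KNOWN_FLAGS DROP_FLAG_RANGE

/* Returns 0 if the request is acceptable, -EINVAL otherwise. */
int drop_cache_req_validate(const struct drop_cache_req *req)
{
    assert(req != NULL);

    if (req->size != sizeof(*req))
        return -EINVAL;
    /* Rejecting unknown bits now is what keeps them usable later. */
    if (req->flags & ~DROP_KNOWN_FLAGS)
        return -EINVAL;
    for (size_t i = 0; i < sizeof(req->reserved) / sizeof(req->reserved[0]); i++)
        if (req->reserved[i] != 0)
            return -EINVAL;
    /* A range without the range flag is treated as a caller bug. */
    if (!(req->flags & DROP_FLAG_RANGE) && (req->offset || req->length))
        return -EINVAL;
    return 0;
}
```

A text file under /proc has no equally obvious place for such reserved fields, which seems to be the concern raised above.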
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 15:23 +0200, Thomas Knauth wrote: > On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy > wrote: > > Thanks for the answer, although you forgot to comment on the question > > about possibly extending the new interface to work with file ranges in > > the future. For example, I have a 2 TiB file, and I am only interested > > in dropping caches for the first couple of gigabytes. Would I extend > > your interface, or would I come up with another one? > > Ah, didn't quite understand what was meant with file ranges. Again, we > had not considered this so far. I guess you could make a distinction > between directories and files here. If the path points to a file, you > can have an optional argument indicating the range of bytes you would > like to drop. Something like > > echo "my-file 0-1000,8000-1000" > /proc/sys/vm/sdrop_cache > > If this is desirable, we can add it to the patch. No, I do not ask to implement this, just trying to understand how the interface could possibly be extended. -- Best Regards, Artem Bityutskiy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy wrote: > Thanks for the answer, although you forgot to comment on the question > about possibly extending the new interface to work with file ranges in > the future. For example, I have a 2 TiB file, and I am only interested > in dropping caches for the first couple of gigabytes. Would I extend > your interface, or would I come up with another one? Ah, didn't quite understand what was meant with file ranges. Again, we had not considered this so far. I guess you could make a distinction between directories and files here. If the path points to a file, you can have an optional argument indicating the range of bytes you would like to drop. Something like echo "my-file 0-1000,8000-1000" > /proc/sys/vm/sdrop_cache If this is desirable, we can add it to the patch. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
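To illustrate what a text-based range argument would demand of the parser, here is a small user-space sketch that parses a list such as "0-1000,8000-10000". The function name parse_ranges, the struct, and the rule that a range whose end is below its start is rejected are all assumptions; the thread does not define the syntax beyond the single echo example, and it does not say whether the end offset is inclusive.

```c
#include <assert.h>
#include <errno.h>
#include <stdlib.h>

struct byte_range {
    unsigned long long start;
    unsigned long long end;
};

/*
 * Hypothetical parser for "START-END[,START-END...]".
 * Returns the number of ranges stored in out, or -1 on malformed input,
 * on a range with end < start, or when more than cap ranges are given.
 */
int parse_ranges(const char *s, struct byte_range *out, int cap)
{
    int n = 0;

    assert(s != NULL && out != NULL);
    while (*s != '\0') {
        char *end;
        unsigned long long a, b;

        if (n == cap)
            return -1;
        /* strtoull() would accept leading spaces and signs; require a digit. */
        if (*s < '0' || *s > '9')
            return -1;
        errno = 0;
        a = strtoull(s, &end, 10);
        if (errno != 0 || *end != '-')
            return -1;
        s = end + 1;
        if (*s < '0' || *s > '9')
            return -1;
        b = strtoull(s, &end, 10);
        if (errno != 0 || b < a)
            return -1;
        out[n].start = a;
        out[n].end = b;
        n++;

        if (*end == ',') {
            s = end + 1;
            if (*s == '\0')
                return -1; /* trailing comma */
        } else if (*end == '\0') {
            s = end;
        } else {
            return -1;
        }
    }
    return n;
}
```

Under these assumed rules the second range in the example above, 8000-1000, would be rejected because its end lies before its start.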
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 12:03 PM, Artem Bityutskiy wrote: > On Wed, 2014-06-25 at 10:25 +0200, Thomas Knauth wrote: >> On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy >> wrote: >> > Plus some explanations WRT why proc-based interface and what would be >> > the alternatives, what if tomorrow we want to extend the functionality >> > and drop caches only for certain file range, is this only for regular >> > files or also for directories, why posix_fadvice(DONTNEED) is not >> > sufficient. >> >> I suggested the idea originally. Let me address each of your questions in >> turn: > > I'd also be interested to see some analysis about path-based interface > vs. file descriptor-base interface. What are cons and pros. E.g. if my > path is a symlink, with path-based interface it is not obvious whether I > drop caches for the symlink itself or caches of the target. Haven't considered this case. It feels like the sensible thing to do here is dereference the link and drop whatever it is pointing to. -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 2:20 PM, Alexey Dobriyan wrote:
>> +static void clean_all_dentries_locked(struct dentry *dentry)
>> +{
>> +	struct dentry *child;
>> +
>> +	list_for_each_entry(child, &dentry->d_subdirs, d_u.d_child) {
>> +		clean_all_dentries_locked(child);
>> +	}
>> +
>> +	clean_mapping(dentry);
>> +}
>
> unbounded recursion = kernel stack overflow
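The usual remedy for this kind of problem is to walk the tree iteratively. The following is a user-space sketch only, on a toy tree with parent and sibling pointers; struct node and visit_postorder are invented for illustration and are not kernel code. It visits children before their parent, as the recursive version in the patch does, but uses constant stack space regardless of tree depth.

```c
#include <assert.h>
#include <stddef.h>

/* Toy tree node; a stand-in for a directory entry, not the kernel's dentry. */
struct node {
    int id;
    struct node *parent;
    struct node *first_child;
    struct node *next_sibling;
};

/*
 * Visit every node under root in post-order without recursion.
 * The ids of visited nodes are stored in out (up to cap entries).
 * Returns the total number of nodes visited.
 */
size_t visit_postorder(struct node *root, int *out, size_t cap)
{
    struct node *cur = root;
    size_t n = 0;

    assert(root != NULL && out != NULL);
    for (;;) {
        /* Descend to the leftmost leaf below cur. */
        while (cur->first_child != NULL)
            cur = cur->first_child;

        /* Visit, then climb until there is a sibling to descend into. */
        for (;;) {
            if (n < cap)
                out[n] = cur->id;
            n++;
            if (cur == root)
                return n;
            if (cur->next_sibling != NULL) {
                cur = cur->next_sibling;
                break;
            }
            cur = cur->parent;
        }
    }
}
```

In the kernel the same idea would additionally have to cope with locking and with the tree changing during the walk, which this sketch ignores.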
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 10:25 +0200, Thomas Knauth wrote: > On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy wrote: > > Plus some explanations WRT why proc-based interface and what would be > > the alternatives, what if tomorrow we want to extend the functionality > > and drop caches only for certain file range, is this only for regular > > files or also for directories, why posix_fadvice(DONTNEED) is not > > sufficient. > > I suggested the idea originally. Let me address each of your questions in > turn: I'd also be interested to see some analysis about path-based interface vs. file descriptor-base interface. What are cons and pros. E.g. if my path is a symlink, with path-based interface it is not obvious whether I drop caches for the symlink itself or caches of the target. Note, if there are no answers, fine with me, I am asking just out of curiosity. -- Best Regards, Artem Bityutskiy -- To unsubscribe from this list: send the line "unsubscribe linux-kernel" in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html Please read the FAQ at http://www.tux.org/lkml/
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 10:25 +0200, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy wrote:
> > Plus some explanations WRT why proc-based interface and what would be
> > the alternatives, what if tomorrow we want to extend the functionality
> > and drop caches only for certain file range, is this only for regular
> > files or also for directories, why posix_fadvise(DONTNEED) is not
> > sufficient.
>
> I suggested the idea originally. Let me address each of your questions in
> turn:

Thanks for the answer, although you forgot to comment on the question
about possibly extending the new interface to work with file ranges in
the future. For example, I have a 2 TiB file, and I am only interested
in dropping caches for the first couple of gigabytes. Would I extend
your interface, or would I come up with another one?

> Why a selective drop? To have a middle ground between echo 2 >
> drop_caches and echo 3 > drop_caches. When is this interesting? My
> particular use case was benchmarking. I wanted to repeatedly measure
> the timing when things were read from disk. Dropping everything from
> the cache, also drops useful things, not just the few files your
> benchmark intends to measure.

Sounds like a reasonable motivation to me.

> Why /proc? Because this is where the current drop_caches mechanism is
> located. If it should go somewhere else, please do suggest so.

I do not have particular suggestions, I am just probing how much effort
was put into choosing the interface.

> Why not use posix_fadvise()? Because it is exactly this, an advice.
> The kernel is free to do whatever, i.e., also ignore the request. We
> want a mechanism that reliably drops select content from the cache.

OK, thanks.

-- 
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy wrote:
> Plus some explanations WRT why proc-based interface and what would be
> the alternatives, what if tomorrow we want to extend the functionality
> and drop caches only for certain file range, is this only for regular
> files or also for directories, why posix_fadvise(DONTNEED) is not
> sufficient.

I suggested the idea originally. Let me address each of your questions in
turn:

Why a selective drop? To have a middle ground between echo 2 >
drop_caches and echo 3 > drop_caches. When is this interesting? My
particular use case was benchmarking. I wanted to repeatedly measure
the timing when things were read from disk. Dropping everything from
the cache, also drops useful things, not just the few files your
benchmark intends to measure.

Why /proc? Because this is where the current drop_caches mechanism is
located. If it should go somewhere else, please do suggest so.

The string is a path, i.e., can be either a file or a directory. In
case of a directory, we recursively drop all its contents.

Why not use posix_fadvise()? Because it is exactly this, an advice.
The kernel is free to do whatever, i.e., also ignore the request. We
want a mechanism that reliably drops select content from the cache.
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Tue, 2014-06-24 at 14:59 -0700, David Rientjes wrote:
> On Tue, 24 Jun 2014, Maksym Planeta wrote:
>
> > To clean the page cache one can use /proc/sys/vm/drop_caches. But this
> > drops the whole page cache. In contrast to that sdrop_caches enables
> > ability to drop the page cache selectively by path string.
> >
> > Suggested-by: Thomas Knauth
> > Signed-off-by: Maksym Planeta
>
> Could you include some information in the commit message about why this is
> useful? Specifically, why you want to drop pagecache only from a specific
> path.
>
> The name of the sysctl is also quite non-descriptive.

Plus some explanations WRT why proc-based interface and what would be
the alternatives, what if tomorrow we want to extend the functionality
and drop caches only for certain file range, is this only for regular
files or also for directories, why posix_fadvise(DONTNEED) is not
sufficient.

-- 
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 2:20 PM, Alexey Dobriyan adobri...@gmail.com wrote:

+static void clean_all_dentries_locked(struct dentry *dentry)
+{
+	struct dentry *child;
+
+	list_for_each_entry(child, &dentry->d_subdirs, d_u.d_child) {
+		clean_all_dentries_locked(child);
+	}
+
+	clean_mapping(dentry);
+}

unbounded recursion = kernel stack overflow
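The recursion concern above can be illustrated outside the kernel. The
sketch below is a user-space analogue, not a fix for the patch: struct
node, walk_postorder() and record_order() are hypothetical names, and
the node layout (parent, first child, next sibling) is an assumption
standing in for the dentry links. It visits children before their
parent, as the patch does, but uses the parent pointers to climb back
up instead of the call stack, so stack use stays constant however deep
the tree is.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for a dentry: only the tree links matter here. */
struct node {
	struct node *parent;
	struct node *first_child;
	struct node *next_sibling;
	int order;		/* filled in by record_order() */
};

typedef void (*visit_fn)(struct node *n, void *ctx);

/*
 * Visit every node in the subtree rooted at 'root' in post-order
 * (children before parent) without recursion. Returns the number of
 * nodes visited.
 */
size_t walk_postorder(struct node *root, visit_fn visit, void *ctx)
{
	struct node *n = root;
	size_t visited = 0;

	if (!root)
		return 0;

	for (;;) {
		/* Descend to the leftmost leaf below the current node. */
		while (n->first_child)
			n = n->first_child;

		/* Visit, then move to a sibling or climb to the parent. */
		for (;;) {
			visit(n, ctx);
			visited++;
			if (n == root)
				return visited;
			if (n->next_sibling) {
				n = n->next_sibling;
				break;
			}
			assert(n->parent != NULL);
			n = n->parent;
		}
	}
}

/* Example callback: number the nodes in the order they are visited. */
void record_order(struct node *n, void *ctx)
{
	int *seq = ctx;

	n->order = ++*seq;
}
```

A real in-kernel walk would additionally have to deal with locking and
with dentries appearing or disappearing during the traversal, which this
sketch ignores.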
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 12:03 PM, Artem Bityutskiy dedeki...@gmail.com wrote:
> On Wed, 2014-06-25 at 10:25 +0200, Thomas Knauth wrote:
> > On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> > > Plus some explanations WRT why proc-based interface and what would be
> > > the alternatives, what if tomorrow we want to extend the functionality
> > > and drop caches only for certain file range, is this only for regular
> > > files or also for directories, why posix_fadvise(DONTNEED) is not
> > > sufficient.
> >
> > I suggested the idea originally. Let me address each of your questions in
> > turn:
>
> I'd also be interested to see some analysis of a path-based interface
> vs. a file descriptor-based interface. What are the pros and cons? E.g.
> if my path is a symlink, with a path-based interface it is not obvious
> whether I drop caches for the symlink itself or caches of the target.

Haven't considered this case. It feels like the sensible thing to do
here is dereference the link and drop whatever it is pointing to.
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> Thanks for the answer, although you forgot to comment on the question
> about possibly extending the new interface to work with file ranges in
> the future. For example, I have a 2 TiB file, and I am only interested
> in dropping caches for the first couple of gigabytes. Would I extend
> your interface, or would I come up with another one?

Ah, didn't quite understand what was meant with file ranges. Again, we
had not considered this so far. I guess you could make a distinction
between directories and files here. If the path points to a file, you
can have an optional argument indicating the range of bytes you would
like to drop. Something like

echo my-file 0-1000,8000-1000 > /proc/sys/vm/sdrop_cache

If this is desirable, we can add it to the patch.
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 15:23 +0200, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> > Thanks for the answer, although you forgot to comment on the question
> > about possibly extending the new interface to work with file ranges in
> > the future. For example, I have a 2 TiB file, and I am only interested
> > in dropping caches for the first couple of gigabytes. Would I extend
> > your interface, or would I come up with another one?
>
> Ah, didn't quite understand what was meant with file ranges. Again, we
> had not considered this so far. I guess you could make a distinction
> between directories and files here. If the path points to a file, you
> can have an optional argument indicating the range of bytes you would
> like to drop. Something like
>
> echo my-file 0-1000,8000-1000 > /proc/sys/vm/sdrop_cache
>
> If this is desirable, we can add it to the patch.

No, I do not ask to implement this, just trying to understand how the
interface could possibly be extended.

-- 
Best Regards,
Artem Bityutskiy
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, 2014-06-25 at 15:23 +0200, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 11:56 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> > Thanks for the answer, although you forgot to comment on the question
> > about possibly extending the new interface to work with file ranges in
> > the future. For example, I have a 2 TiB file, and I am only interested
> > in dropping caches for the first couple of gigabytes. Would I extend
> > your interface, or would I come up with another one?
>
> Ah, didn't quite understand what was meant with file ranges. Again, we
> had not considered this so far. I guess you could make a distinction
> between directories and files here. If the path points to a file, you
> can have an optional argument indicating the range of bytes you would
> like to drop. Something like
>
> echo my-file 0-1000,8000-1000 > /proc/sys/vm/sdrop_cache
>
> If this is desirable, we can add it to the patch.

With a binary interface like an ioctl I can see how you could have extra
unused fields which you can ignore now and let people start adding extra
options like the range in the future. With this kind of interface I am
not sure how to do this.

Other questions I'd ask would be: how about the access control model?
Will only root be able to drop caches? Why can't I drop caches for my
own file?

I did not put much thinking into this, but it looks like ioctl could be
a better interface for the task you are trying to solve...

Sorry if I am a bit vague, I am mostly trying to make you guys give this
more thought, and come up with a deeper analysis. Interfaces are very
important to get right, or as right as possible...

-- 
Best Regards,
Artem Bityutskiy
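To make the "extra unused fields" idea above concrete, here is a minimal
sketch of what such an argument block might look like. Everything in it
is hypothetical: struct drop_cache_args, DROP_CACHE_VALID_FLAGS and
drop_cache_args_validate() are illustrative names, no such ioctl exists
in the thread or the patch, and the choice to reject non-zero reserved
fields (rather than silently ignore them) is an assumption made so the
fields can be given a meaning later without breaking old callers.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* No flags are defined in this sketch, so every flag bit is invalid. */
#define DROP_CACHE_VALID_FLAGS 0u

/* Hypothetical argument block for an fd-based "drop cache" request. */
struct drop_cache_args {
	uint64_t offset;	/* start of the byte range */
	uint64_t len;		/* length of the range, 0 = to end of file */
	uint32_t flags;		/* must be 0 for now */
	uint32_t reserved[5];	/* must be 0, room for future options */
};

/*
 * Validate a request the way a handler might before acting on it.
 * Returns 0 if acceptable, -EINVAL otherwise.
 */
int drop_cache_args_validate(const struct drop_cache_args *args)
{
	size_t i;

	assert(args != NULL);

	if (args->flags & ~DROP_CACHE_VALID_FLAGS)
		return -EINVAL;

	for (i = 0; i < sizeof(args->reserved) / sizeof(args->reserved[0]); i++)
		if (args->reserved[i] != 0)
			return -EINVAL;

	/* Reject ranges whose end would wrap around. */
	if (args->len != 0 && args->offset + args->len < args->offset)
		return -EINVAL;

	return 0;
}
```

Because the request would be issued on an open file descriptor, the
access control question raised above could then follow from whether the
caller was able to open the file, though that too is only one possible
design.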
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed 2014-06-25 10:25:05, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> > Plus some explanations WRT why proc-based interface and what would be
> > the alternatives, what if tomorrow we want to extend the functionality
> > and drop caches only for certain file range, is this only for regular
> > files or also for directories, why posix_fadvise(DONTNEED) is not
> > sufficient.
>
> I suggested the idea originally. Let me address each of your questions in
> turn:
>
> Why a selective drop? To have a middle ground between echo 2 >
> drop_caches and echo 3 > drop_caches. When is this interesting? My
> particular use case was benchmarking. I wanted to repeatedly measure
> the timing when things were read from disk. Dropping everything from
> the cache, also drops useful things, not just the few files your
> benchmark intends to measure.
>
> Why /proc? Because this is where the current drop_caches mechanism is
> located. If it should go somewhere else, please do suggest so.

It sounds like this should be a new syscall. Echoing filenames into
files is strange/ugly.

									Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Wed, Jun 25, 2014 at 10:25:05AM +0200, Thomas Knauth wrote:
> On Wed, Jun 25, 2014 at 8:25 AM, Artem Bityutskiy dedeki...@gmail.com wrote:
> > Plus some explanations WRT why proc-based interface and what would be
> > the alternatives, what if tomorrow we want to extend the functionality
> > and drop caches only for certain file range, is this only for regular
> > files or also for directories, why posix_fadvise(DONTNEED) is not
> > sufficient.
>
> I suggested the idea originally. Let me address each of your questions in
> turn:
>
> Why a selective drop? To have a middle ground between echo 2 >
> drop_caches and echo 3 > drop_caches. When is this interesting? My
> particular use case was benchmarking. I wanted to repeatedly measure
> the timing when things were read from disk. Dropping everything from
> the cache, also drops useful things, not just the few files your
> benchmark intends to measure.

We're not likely to ever extend the drop_caches functionality. This
is brought up semi-regularly by people that have some slightly
narrower use-case for dropping caches.

Your particular use case can be handled by directing your benchmark
at a filesystem mount point and unmounting the filesystem in between
benchmark runs. There is no need to add kernel functionality for
something that can be so easily achieved by other means, especially
in benchmark environments where *everything* is tightly controlled.

Cheers,

Dave.
-- 
Dave Chinner
da...@fromorbit.com
Re: [PATCH] sysctl: Add a feature to drop caches selectively
On Tue, 24 Jun 2014, Maksym Planeta wrote:

> To clean the page cache one can use /proc/sys/vm/drop_caches. But this
> drops the whole page cache. In contrast to that sdrop_caches enables
> ability to drop the page cache selectively by path string.
>
> Suggested-by: Thomas Knauth
> Signed-off-by: Maksym Planeta

Could you include some information in the commit message about why this is
useful? Specifically, why you want to drop pagecache only from a specific
path.

The name of the sysctl is also quite non-descriptive.
[PATCH] sysctl: Add a feature to drop caches selectively
To clean the page cache one can use /proc/sys/vm/drop_caches. But this
drops the whole page cache. In contrast to that sdrop_caches enables
ability to drop the page cache selectively by path string.

Suggested-by: Thomas Knauth
Signed-off-by: Maksym Planeta
---
 Documentation/sysctl/vm.txt | 15 ++
 fs/Makefile | 2 +-
 fs/sdrop_caches.c | 124
 3 files changed, 140 insertions(+), 1 deletion(-)
 create mode 100644 fs/sdrop_caches.c

diff --git a/Documentation/sysctl/vm.txt b/Documentation/sysctl/vm.txt
index bd4b34c..faad01d 100644
--- a/Documentation/sysctl/vm.txt
+++ b/Documentation/sysctl/vm.txt
@@ -28,6 +28,7 @@ Currently, these files are in /proc/sys/vm:
 - dirty_ratio
 - dirty_writeback_centisecs
 - drop_caches
+- sdrop_caches
 - extfrag_threshold
 - hugepages_treat_as_movable
 - hugetlb_shm_group
@@ -211,6 +212,20 @@ with your system. To disable them, echo 4 (bit 3) into drop_caches.
 
 ==
 
+sdrop_caches
+
+Writing to this will cause the kernel to drop clean caches starting from
+specified path.
+
+To free pagecache of a file:
+	echo /home/user/file > /proc/sys/vm/sdrop_caches
+To free pagecache of a directory and all files in it.
+	echo /home/user/directly > /proc/sys/vm/sdrop_caches
+
+Restrictions are the same as for drop_caches.
+
+==
+
 extfrag_threshold
 
 This parameter affects whether the kernel will compact memory or direct
diff --git a/fs/Makefile b/fs/Makefile
index 4030cbf..366c7b9 100644
--- a/fs/Makefile
+++ b/fs/Makefile
@@ -44,7 +44,7 @@ obj-$(CONFIG_FS_MBCACHE) += mbcache.o
 obj-$(CONFIG_FS_POSIX_ACL) += posix_acl.o
 obj-$(CONFIG_NFS_COMMON) += nfs_common/
 obj-$(CONFIG_COREDUMP) += coredump.o
-obj-$(CONFIG_SYSCTL) += drop_caches.o
+obj-$(CONFIG_SYSCTL) += drop_caches.o sdrop_caches.o
 
 obj-$(CONFIG_FHANDLE) += fhandle.o
diff --git a/fs/sdrop_caches.c b/fs/sdrop_caches.c
new file mode 100644
index 000..c193655
--- /dev/null
+++ b/fs/sdrop_caches.c
@@ -0,0 +1,124 @@
+/*
+ * Implement the manual selective drop pagecache function
+ */
+
+#include <linux/module.h>
+
+
+#include <linux/kernel.h>
+#include <linux/proc_fs.h>
+#include <linux/string.h>
+#include <linux/vmalloc.h>
+#include <linux/uaccess.h>
+#include <linux/mm.h>
+#include <linux/fs.h>
+#include <linux/writeback.h>
+#include <linux/sysctl.h>
+#include <linux/gfp.h>
+#include <linux/limits.h>
+#include <linux/namei.h>
+
+static void clean_mapping(struct dentry *dentry)
+{
+	struct inode *inode = dentry->d_inode;
+
+	if (!inode)
+		return;
+
+	if ((inode->i_state & (I_FREEING|I_WILL_FREE|I_NEW)) ||
+	    (inode->i_mapping->nrpages == 0)) {
+		return;
+	}
+
+	invalidate_mapping_pages(inode->i_mapping, 0, -1);
+}
+
+static void clean_all_dentries_locked(struct dentry *dentry)
+{
+	struct dentry *child;
+
+	list_for_each_entry(child, &dentry->d_subdirs, d_u.d_child) {
+		clean_all_dentries_locked(child);
+	}
+
+	clean_mapping(dentry);
+}
+
+static void clean_all_dentries(struct dentry *dentry)
+{
+	spin_lock_nested(&dentry->d_lock, DENTRY_D_LOCK_NESTED);
+	clean_all_dentries_locked(dentry);
+	spin_unlock(&dentry->d_lock);
+}
+
+static int drop_pagecache(const char * __user filename)
+{
+	unsigned int lookup_flags = LOOKUP_FOLLOW;
+	struct path path;
+	int error;
+
+retry:
+	error = user_path_at(AT_FDCWD, filename, lookup_flags, &path);
+	if (!error) {
+		/* clean */
+		clean_all_dentries(path.dentry);
+	}
+	if (retry_estale(error, lookup_flags)) {
+		lookup_flags |= LOOKUP_REVAL;
+		goto retry;
+	}
+	return error;
+}
+
+static int sdrop_ctl_handler(struct ctl_table *table, int write,
+			     void __user *buffer, size_t *lenp, loff_t *ppos)
+{
+	char __user *pathname = buffer + *lenp - 1;
+
+	put_user('\0', pathname);
+
+	if (!write)
+		return 0;
+
+	return drop_pagecache(buffer);
+}
+
+static struct ctl_path vm_path[] = { { .procname = "vm", }, { } };
+static struct ctl_table sdrop_ctl_table[] = {
+	{
+		.procname = "sdrop_caches",
+		.mode = 0644,
+		.proc_handler = sdrop_ctl_handler,
+	},
+	{ }
+};
+
+static struct ctl_table_header *sdrop_proc_entry;
+
+/* Init function called on module entry */
+int sdrop_init(void)
+{
+	int ret = 0;
+
+	sdrop_proc_entry = register_sysctl_paths(vm_path, sdrop_ctl_table);
+
+	if (sdrop_proc_entry == NULL) {
+		ret = -ENOMEM;
+		pr_err("sdrop_caches: Couldn't create proc entry\n");
+	}
+
+	return ret;
+}
+
+/* Cleanup function called on module exit */
+void sdrop_cleanup(void)
+{
+	unregister_sysctl_table(sdrop_proc_entry);
+}
+
+module_init(sdrop_init);
+module_exit(sdrop_cleanup);
+
+MODULE_LICENSE("GPL");
+MODULE_DESCRIPTION("Selective pagecache drop