Re: [PATCH 01/17] pramfs: documentation
2011/1/8 Marco Stornelli marco.storne...@gmail.com:
> On 07/01/2011 22:59, Tony Luck wrote:
>> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli marco.storne...@gmail.com wrote:
>>> constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.
>>
>> But you can still get pramfs inconsistencies if the system crashes at an inopportune moment. E.g. when making files you write the new inode to pramfs, and then you insert the entry into the directory. A crash between these two operations leaves an allocated inode that doesn't appear in any directory. Without a fsck option, it will be hard to see that you have this problem, and your only recovery option is to wipe *all* files by making a new filesystem.
>
> Is it a problem if you lose some logs? However, do you expect that fsck in this case will drop the inode?

IF there could be some inconsistencies in the file-system AND there is no way to fix up these inconsistencies other than purging their allocated space, THEN I think the best approach would be to clear these inconsistencies at mount time and print a WARNING message for debug/stats purposes. Otherwise a user-space tool would be better, because it could also be used in interactive mode.

Obviously the best would be to not have any inconsistencies at all. However, in the real world, the trade-off between a journaling fs and a simpler one in terms of code and memory usage could make it acceptable to adopt a simpler fs rather than a journaled one. The kernel documentation should clearly inform the user about the pros and cons of adopting a simpler fs, especially about data loss conditions.
-RAF -- To unsubscribe from this list: send the line unsubscribe linux-embedded in the body of a message to majord...@vger.kernel.org More majordomo info at http://vger.kernel.org/majordomo-info.html
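As a rough illustration of the mount-time cleanup suggested above, here is a small user-space model in Python. It is not pramfs code: the inode table, the directory map, `ROOT_INO`, and both function names are hypothetical, and the real on-media layout is not described in this thread. It only shows the idea of walking the directory tree, freeing allocated inodes that nothing references, and logging a warning for each.

```python
import logging

log = logging.getLogger("pramfs-model")

ROOT_INO = 1  # hypothetical root directory inode number


def reachable_inodes(directories, root=ROOT_INO):
    """Walk the directory tree from the root and collect every inode seen."""
    seen = set()
    stack = [root]
    while stack:
        ino = stack.pop()
        if ino in seen:
            continue
        seen.add(ino)
        # Only directories have entries; regular files contribute nothing.
        stack.extend(directories.get(ino, {}).values())
    return seen


def clear_orphans(inode_table, directories):
    """Free allocated inodes that no directory references; return their numbers."""
    linked = reachable_inodes(directories)
    freed = []
    for ino in sorted(inode_table):
        if inode_table[ino]["allocated"] and ino not in linked:
            log.warning("orphan inode %d found at mount time, freeing it", ino)
            inode_table[ino]["allocated"] = False
            freed.append(ino)
    return freed


# Simulated crash: inode 3 was written, but its directory entry never was.
inodes = {n: {"allocated": True} for n in (1, 2, 3)}
dirs = {1: {"messages.log": 2}}
freed = clear_orphans(inodes, dirs)
```

In this model the orphaned inode is simply dropped, which matches the "purge the allocated space" option; an interactive user-space tool could instead offer to relink it somewhere.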
Re: [PATCH 01/17] pramfs: documentation
> On 07/01/2011 22:59, Tony Luck wrote:
>> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli marco.storne...@gmail.com wrote:
>>> constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.
>>
>> But you can still get pramfs inconsistencies if the system crashes at an inopportune moment. E.g. when making files you write the new inode to pramfs, and then you insert the entry into the directory. A crash between these two operations leaves an allocated inode that doesn't appear in any directory. Without a fsck option, it will be hard to see that you have this problem, and your only recovery option is to wipe *all* files by making a new filesystem.
>
> Is it a problem if you lose some logs? However, do you expect that fsck in this case will drop the inode?

Ask it the other way around. What is a persistent filesystem good for when it is only persistent sometimes? You'd be better off running ext2 over a special block device, it is quite simple.

Pavel
--
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html
Re: [PATCH 01/17] pramfs: documentation
2011/1/10 Pavel Machek pa...@ucw.cz:
>> On 07/01/2011 22:59, Tony Luck wrote:
>>> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli marco.storne...@gmail.com wrote:
>>>> constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.
>>>
>>> But you can still get pramfs inconsistencies if the system crashes at an inopportune moment. E.g. when making files you write the new inode to pramfs, and then you insert the entry into the directory. A crash between these two operations leaves an allocated inode that doesn't appear in any directory. Without a fsck option, it will be hard to see that you have this problem, and your only recovery option is to wipe *all* files by making a new filesystem.
>>
>> Is it a problem if you lose some logs? However, do you expect that fsck in this case will drop the inode?
>
> Ask it the other way around. What is a persistent filesystem good for when it is only persistent sometimes? You'd be better off running ext2 over a special block device, it is quite simple.

OK, I can work on it. However, can a userspace tool prevent the inclusion of the fs in linux-next?

Marco
RE: [PATCH 01/17] pramfs: documentation
> You'd be better off running ext2 over a special block device, it is quite simple.

Marco,

You might want to spend some more time answering this question (it is a particularly good one). What are the reasons to use pramfs rather than ext2 over a mem-block driver? You covered some in your part 0 patch (like ext2 wastes time getting optimal block placement for rotating media). But it might be a good idea to go back over them here.

From my (lightweight) reading of your code, it looks like the biggest benefit is avoiding duplicating the data in the pramfs memory region and the VM page cache ... which is a big deal for your target audience of hand held devices where memory is a somewhat scarce resource. But you probably have other goodness in there too.

-Tony
Re: [PATCH 01/17] pramfs: documentation
On 10/01/2011 18:35, Luck, Tony wrote:
>> You'd be better off running ext2 over a special block device, it is quite simple.
>
> Marco,
>
> You might want to spend some more time answering this question (it is a particularly good one). What are the reasons to use pramfs rather than ext2 over a mem-block driver? You covered some in your part 0 patch (like ext2 wastes time getting optimal block placement for rotating media). But it might be a good idea to go back over them here.
>
> From my (lightweight) reading of your code, it looks like the biggest benefit is avoiding duplicating the data in the pramfs memory region and the VM page cache ... which is a big deal for your target audience of hand held devices where memory is a somewhat scarce resource. But you probably have other goodness in there too.
>
> -Tony

I can add that you can place the fs wherever you want; with ext2 you cannot, not without building something special, as Pavel said. Honestly, I don't know what else to add. I think the documentation, the web site information and the benchmarks say it all. You get a fs that is simple, doesn't consume a lot of resources (you can fine-tune the metadata space via the N and bpi options, for example), performs better in this environment, and has the memory protection feature when available... what else? I could write a piece of code that turns on your coffee machine in the morning, what do you think? :)

Marco
Re: [PATCH 01/17] pramfs: documentation
On 07/01/2011 22:59, Tony Luck wrote:
> On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli marco.storne...@gmail.com wrote:
>> constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.
>
> But you can still get pramfs inconsistencies if the system crashes at an inopportune moment. E.g. when making files you write the new inode to pramfs, and then you insert the entry into the directory. A crash between these two operations leaves an allocated inode that doesn't appear in any directory. Without a fsck option, it will be hard to see that you have this problem, and your only recovery option is to wipe *all* files by making a new filesystem.

Is it a problem if you lose some logs? However, do you expect that fsck in this case will drop the inode?
Re: [PATCH 01/17] pramfs: documentation
On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli marco.storne...@gmail.com wrote:
> +accessed data that must survive system reboots and power cycles. An
> +example usage might be system logs under /var/log, or a user address
> +book in a cell phone or PDA.

Some usage model questions:

How do you handle errors? I see that there are a few sanity checks in the mount path ... but there would seem to be several opportunities for the file system to get corrupted in other ways. Since you don't have a block device, a standard fsck program looks challenging (though I guess you could mmap(/dev/mem) to peek/poke at the filesystem before trying to mount it).

Some sort of recovery path would seem useful for the address book use model ... or do you just expect users to back their address book up (to the cloud?) and have the phone just make a clean filesystem if any errors are found?

What about quotas? You have a fixed amount of persistent space, and presumably a number of apps that the user installs on their device that may like to use pramfs to store data. Do you need some kernel enforcement to stop one rogue application from using up all the space? Or do you expect that this would be handled in some library level interface that applications will use to access pramfs?

-Tony
Re: [PATCH 01/17] pramfs: documentation
On 07/01/2011 19:42, Tony Luck wrote:
> On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli marco.storne...@gmail.com wrote:
>> +accessed data that must survive system reboots and power cycles. An
>> +example usage might be system logs under /var/log, or a user address
>> +book in a cell phone or PDA.
>
> Some usage model questions:
>
> How do you handle errors? I see that there are a few sanity checks in the mount path ... but there would seem to be several opportunities for the file system to get corrupted in other ways. Since you don't have a block device, a standard fsck program looks challenging (though I guess you could mmap(/dev/mem) to peek/poke at the filesystem before trying to mount it).

Actually not (at least when the strict devmem option is turned on) because the memory region is marked exclusive at the moment (only a design constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.

> Some sort of recovery path would seem useful for the address book use model ... or do you just expect users to back their address book up (to the cloud?) and have the phone just make a clean filesystem if any errors are found?

Yeah, maybe the address book is not a perfectly suitable case, but it was only an example. I thought about the fs as a cache in this use case. However, the designer can use this area however he wants. Recently I saw a project where this fs was used as a system cache for decrypted files, while the files were stored encrypted in flash, so I think it's flexible.

> What about quotas? You have a fixed amount of persistent space, and presumably a number of apps that the user installs on their device that may like to use pramfs to store data. Do you need some kernel enforcement to stop one rogue application from using up all the space? Or do you expect that this would be handled in some library level interface that applications will use to access pramfs?

Honestly, in my embedded systems I've never used quotas, partly to save footprint (for the kernel support, I mean). I don't think it's a hot feature in this case, and other filesystems for embedded use, such as ubifs, jffs2 etc., don't support it.

Marco
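To make the "checksum plus redundant superblock" idea from this message concrete, here is a minimal user-space sketch in Python. Everything in it is an assumption for illustration: the field layout in `SB_FORMAT`, the `MAGIC` value, the use of CRC32, and the function names are made up, since the thread does not say which checksum or layout pramfs actually uses. It shows one plausible mount-time policy: trust whichever copy verifies, and rewrite the bad copy from the good one.

```python
import struct
import zlib

# Hypothetical superblock layout: magic, filesystem size in bytes, block size.
SB_FORMAT = "<IQI"
MAGIC = 0x5052414D  # made-up value ("PRAM" in ASCII), not the real magic


def pack_superblock(size, block_size):
    """Serialize the fields and append a CRC32 of them (an assumed checksum)."""
    body = struct.pack(SB_FORMAT, MAGIC, size, block_size)
    return body + struct.pack("<I", zlib.crc32(body))


def is_valid(raw):
    """A copy is valid if its length, stored checksum and magic all match."""
    if len(raw) != struct.calcsize(SB_FORMAT) + 4:
        return False
    body, stored = raw[:-4], struct.unpack("<I", raw[-4:])[0]
    return zlib.crc32(body) == stored and struct.unpack(SB_FORMAT, body)[0] == MAGIC


def recover(primary, backup):
    """Return (primary, backup) after repairing whichever copy is bad."""
    if is_valid(primary):
        return primary, (backup if is_valid(backup) else primary)
    if is_valid(backup):
        return backup, backup
    raise ValueError("both superblock copies are corrupt")


good = pack_superblock(size=2 * 1024 * 1024, block_size=1024)
damaged = bytearray(good)
damaged[5] ^= 0xFF  # flip bits inside the size field of the primary copy
primary, backup = recover(bytes(damaged), good)
```

Note that this only covers corruption of a single copy; it says nothing about the orphan-inode case raised elsewhere in the thread, which a checksum cannot detect.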
Re: [PATCH 01/17] pramfs: documentation
On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli marco.storne...@gmail.com wrote:
> constraint). About the errors: pramfs does not maintain file data in the page caches for normal file I/O, so no writeback, the read/write operations are done with direct io and they are always sync. The data are write protected in hw when the arch provides this facility (x86 does). Inodes contain a checksum and when there are problems they are marked as bad. The superblock contains a checksum and there is a redundant superblock.

But you can still get pramfs inconsistencies if the system crashes at an inopportune moment. E.g. when making files you write the new inode to pramfs, and then you insert the entry into the directory. A crash between these two operations leaves an allocated inode that doesn't appear in any directory. Without a fsck option, it will be hard to see that you have this problem, and your only recovery option is to wipe *all* files by making a new filesystem.

-Tony
[PATCH 01/17] pramfs: documentation
From: Marco Stornelli marco.storne...@gmail.com

Documentation for PRAMFS.

Signed-off-by: Marco Stornelli marco.storne...@gmail.com
---
diff --git a/Documentation/filesystems/pramfs.txt b/Documentation/filesystems/pramfs.txt
new file mode 100644
index 0000000..2ad536f
--- /dev/null
+++ b/Documentation/filesystems/pramfs.txt
@@ -0,0 +1,179 @@
+
+PRAMFS Overview
+===============
+
+Many embedded systems have a block of non-volatile RAM separate from
+normal system memory, i.e. for which the kernel maintains no memory page
+descriptors. For such systems it would be beneficial to mount a
+fast read/write filesystem over this I/O memory, for storing frequently
+accessed data that must survive system reboots and power cycles. An
+example usage might be system logs under /var/log, or a user address
+book in a cell phone or PDA.
+
+Linux traditionally had no support for a persistent, non-volatile RAM-based
+filesystem, persistent meaning the filesystem survives a system reboot
+or power cycle intact. The RAM-based filesystems such as tmpfs and ramfs
+have no actual backing store but exist entirely in the page and buffer
+caches, hence the filesystem disappears after a system reboot or
+power cycle.
+
+A relatively straightforward solution is to write a simple block driver
+for the non-volatile RAM, and mount over it any disk-based filesystem such
+as ext2, ext3, ext4, etc.
+
+But the disk-based fs over non-volatile RAM block driver approach has
+some drawbacks:
+
+1. Complexity of disk-based fs: disk-based filesystems such as ext2/ext3/ext4
+   were designed for optimum performance on spinning disk media, so they
+   implement features such as block groups, which attempt to group inode data
+   into a contiguous set of data blocks to minimize disk seeking when accessing
+   files. For RAM there is no such concern; a file's data blocks can be
+   scattered throughout the media with no access speed penalty at all. So block
+   groups in a filesystem mounted over RAM just add unnecessary
+   complexity. A better approach is to use a filesystem specifically
+   tailored to RAM media which does away with these disk-based features.
+   This increases the efficient use of space on the media, i.e. more
+   space is dedicated to actual file data storage and less to meta-data
+   needed to maintain that file data.
+
+2. Different problems between disks and RAM: Because PRAMFS attempts to avoid
+   filesystem corruption caused by kernel bugs, dirty pages in the page cache
+   are not allowed to be written back to the backing-store RAM. This way, an
+   errant write into the page cache will not get written back to the filesystem.
+   However, if the backing-store RAM is comparable in access speed to system
+   memory, the penalty of not using caching is minimal. With this consideration
+   it's better to move file data directly between the user buffers and the backing
+   store RAM, i.e. use direct I/O. This prevents the unnecessary populating of
+   the page cache with dirty pages. However direct I/O has to be enabled at
+   every file open. To enable direct I/O at all times for all regular files
+   requires either that applications be modified to include the O_DIRECT flag on
+   all file opens, or that the filesystem used performs direct I/O by default.
+
+The Persistent/Protected RAM Special Filesystem (PRAMFS) is a read/write
+filesystem that has been designed to address these issues. PRAMFS is targeted
+to fast I/O memory, and if the memory is non-volatile, the filesystem will be
+persistent.
+
+In PRAMFS, direct I/O is enabled across all files in the filesystem, in other
+words the O_DIRECT flag is forced on every open of a PRAMFS file. Also, file
+I/O in the PRAMFS is always synchronous. There is no need to block the current
+process while the transfer to/from the PRAMFS is in progress, since one of
+the requirements of the PRAMFS is that the filesystem exists in fast RAM. So
+file I/O in PRAMFS is always direct, synchronous, and never blocks.
+
+The data organization in PRAMFS can be thought of as an extremely simplified
+version of ext2, such that the ratio of data to meta-data is very high.
+
+PRAMFS supports execute-in-place (XIP). With XIP, instead of keeping data in the
+page cache, the need to have a page cache copy is eliminated completely.
+Read/write type operations are performed directly from/to the memory. For file
+mappings, the RAM itself is mapped directly into userspace. XIP, in addition,
+speeds up application start-up time because it removes the need for any
+copies.
+
+PRAMFS is write protected. The page table entries that map the backing-store
+RAM are normally marked read-only. Write operations into the filesystem
+temporarily mark the affected pages as writeable, the write operation is
+carried out with locks held, and then the page table entries are
+marked read-only again.
+This feature provides protection against filesystem corruption caused by errant
+writes into the RAM due to kernel