Re: [PATCH 01/17] pramfs: documentation

2011-01-11 Thread Roberto A. Foglietta
2011/1/8 Marco Stornelli marco.storne...@gmail.com:
 On 07/01/2011 22:59, Tony Luck wrote:
 On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
 marco.storne...@gmail.com wrote:
 constraint). About the errors: pramfs does not maintain file data in the
 page caches for normal file I/O, so no writeback, the read/write
 operation are done with direct io and they are always sync. The data are
 write protected in hw when the arch provide this facility (x86 does).
 Inode contains a checksum and when there are problems they are marked as
 bad. Superblock contains checksum and there is a redundant superblock.

 But you can still get pramfs inconsistencies if the system crashes at an
 inopportune moment. E.g. when making files you write the new inode to
 pramfs, and then you insert the entry into the directory. A crash between
 these two operations leaves an allocated inode that doesn't appear in
 any directory.  Without a fsck option, it will be hard to see that you have
 this problem, and your only recovery option is to wipe *all* files by making
 a new filesystem.

 Is it a problem if you lost some logs? However do you expect that fsck
 in this case will drop the inode?


IF there can be inconsistencies in the file-system AND there is no way
to fix them other than purging their allocated space, THEN I think the
best approach would be to clear these inconsistencies at mount time and
print a WARNING message for debug/stats purposes. Otherwise a user-space
tool would be better, because it could also be used in interactive
mode.

Obviously the best would be to have no inconsistencies at all.
However, in the real world, the trade-off between a journaling fs and a
simpler one in terms of code and memory usage could make it acceptable
to adopt the simpler fs rather than a journaled one. The kernel
documentation should clearly inform the user about the pros and cons of
adopting a simpler fs, especially about data loss conditions.

-RAF
--
To unsubscribe from this list: send the line unsubscribe linux-embedded in
the body of a message to majord...@vger.kernel.org
More majordomo info at  http://vger.kernel.org/majordomo-info.html


Re: [PATCH 01/17] pramfs: documentation

2011-01-10 Thread Pavel Machek
 On 07/01/2011 22:59, Tony Luck wrote:
  On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
  marco.storne...@gmail.com wrote:
  constraint). About the errors: pramfs does not maintain file data in the
  page caches for normal file I/O, so no writeback, the read/write
  operation are done with direct io and they are always sync. The data are
  write protected in hw when the arch provide this facility (x86 does).
  Inode contains a checksum and when there are problems they are marked as
  bad. Superblock contains checksum and there is a redundant superblock.
  
  But you can still get pramfs inconsistencies if the system crashes at an
  inopportune moment. E.g. when making files you write the new inode to
  pramfs, and then you insert the entry into the directory. A crash between
  these two operations leaves an allocated inode that doesn't appear in
  any directory.  Without a fsck option, it will be hard to see that you have
  this problem, and your only recovery option is to wipe *all* files by making
  a new filesystem.
 
 Is it a problem if you lost some logs? However do you expect that fsck
 in this case will drop the inode?

Ask it the other way around.

What is a persistent filesystem good for when it is only persistent
sometimes?

You'd be better off running ext2 over a special block device; it is quite simple.


Pavel
-- 
(english) http://www.livejournal.com/~pavelmachek
(cesky, pictures) http://atrey.karlin.mff.cuni.cz/~pavel/picture/horses/blog.html


Re: [PATCH 01/17] pramfs: documentation

2011-01-10 Thread Marco Stornelli
2011/1/10 Pavel Machek pa...@ucw.cz:
 On 07/01/2011 22:59, Tony Luck wrote:
  On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
  marco.storne...@gmail.com wrote:
  constraint). About the errors: pramfs does not maintain file data in the
  page caches for normal file I/O, so no writeback, the read/write
  operation are done with direct io and they are always sync. The data are
  write protected in hw when the arch provide this facility (x86 does).
  Inode contains a checksum and when there are problems they are marked as
  bad. Superblock contains checksum and there is a redundant superblock.
 
  But you can still get pramfs inconsistencies if the system crashes at an
  inopportune moment. E.g. when making files you write the new inode to
  pramfs, and then you insert the entry into the directory. A crash between
  these two operations leaves an allocated inode that doesn't appear in
  any directory.  Without a fsck option, it will be hard to see that you have
  this problem, and your only recovery option is to wipe *all* files by making
  a new filesystem.

 Is it a problem if you lost some logs? However do you expect that fsck
 in this case will drop the inode?

 Ask it the other way around.

 What is persistent filesystem good for when it is only persistent
 sometimes?

 You'd be better running ext2 over special block device, it is quite simple.


OK, I can work on it. However, can a userspace tool hold up the
inclusion of the fs in linux-next?

Marco


RE: [PATCH 01/17] pramfs: documentation

2011-01-10 Thread Luck, Tony
 You'd be better running ext2 over special block device,
 it is quite simple.

Marco,

You might want to spend some more time answering this question
(it is a particularly good one).  What are the reasons to use
pramfs, rather than ext2 over a mem-block driver?  You covered
some in your part 0 patch (like ext2 wastes time getting optimal
block placement for rotating media). But it might be a good idea
to go back over them here.  From my (lightweight) reading of your
code, it looks like the biggest benefit is avoiding duplicating
the data in the pramfs memory region and the VM page cache ...
which is a big deal for your target audience of hand held devices
where memory is a somewhat scarce resource. But you probably
have other goodness in there too.

-Tony


Re: [PATCH 01/17] pramfs: documentation

2011-01-10 Thread Marco Stornelli
On 10/01/2011 18:35, Luck, Tony wrote:
 You'd be better running ext2 over special block device,
 it is quite simple.
 
 Marco,
 
 You might want to spend some more time answering this question
 (it is a particularly good one).  What are the reasons to use
 pramfs, rather than a ext2 over a mem-block driver.  You covered
 some in your part 0 patch (like ext2 wastes time getting optimal
 block placement for rotating media). But it might be a good idea
 to go back over them here.  From my (lightweight) reading of your
 code, it looks like the biggest benefit is avoiding duplicating
 the data in the pramfs memory region and the VM page cache ...
 which is a big deal for your target audience of hand held devices
 where memory is a somewhat scarce resource. But you probably
 have other goodness in there too.
 
 -Tony
 

I can add that you can place the fs wherever you want; with ext2 you
cannot, not without building something special, as Pavel said. Honestly,
I don't know what else to add. I think the documentation, the web site
information and the benchmarks say it all. You get a fs that is simple,
that doesn't consume a lot of resources (you can fine-tune the metadata
space via the N and bpi options, for example), that performs better in
this environment, and that has the memory protection feature when
available... what else? I could write a piece of code that turns on your
coffee machine in the morning, what do you think? :)

Marco


Re: [PATCH 01/17] pramfs: documentation

2011-01-08 Thread Marco Stornelli
On 07/01/2011 22:59, Tony Luck wrote:
 On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
 marco.storne...@gmail.com wrote:
 constraint). About the errors: pramfs does not maintain file data in the
 page caches for normal file I/O, so no writeback, the read/write
 operation are done with direct io and they are always sync. The data are
 write protected in hw when the arch provide this facility (x86 does).
 Inode contains a checksum and when there are problems they are marked as
 bad. Superblock contains checksum and there is a redundant superblock.
 
 But you can still get pramfs inconsistencies if the system crashes at an
 inopportune moment. E.g. when making files you write the new inode to
 pramfs, and then you insert the entry into the directory. A crash between
 these two operations leaves an allocated inode that doesn't appear in
 any directory.  Without a fsck option, it will be hard to see that you have
 this problem, and your only recovery option is to wipe *all* files by making
 a new filesystem.

Is it a problem if you lose some logs? And anyway, do you expect that
fsck would drop the inode in this case?


Re: [PATCH 01/17] pramfs: documentation

2011-01-07 Thread Tony Luck
On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
marco.storne...@gmail.com wrote:
 +accessed data that must survive system reboots and power cycles. An
 +example usage might be system logs under /var/log, or a user address
 +book in a cell phone or PDA.

Some usage model questions:

How do you handle errors?  I see that there are a few sanity checks in the
mount path ... but there would seem to be several opportunities for the
file system to get corrupted in other ways.  Since you don't have a block
device, a standard fsck program looks challenging (though I guess you
could mmap(/dev/mem) to peek & poke at the filesystem before trying
to mount it).  Some sort of recovery path would seem useful for the address
book use model ... or do you just expect users to back their address book
up (to the cloud?) and have the phone just make a clean filesystem if any
errors are found?
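
The "peek" half of that idea can be sketched in user space as below. This is only an illustration: demo_peek() is a hypothetical helper, not part of the pramfs patch, and on a real system the path would be /dev/mem with the offset being the physical address of the pramfs region, which needs root and a kernel that permits access to that region.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <fcntl.h>
#include <string.h>
#include <sys/mman.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper: copy `len` bytes found at `offset` in `path` into
 * `out` through a read-only mapping. Returns 0 on success, -1 on error.
 */
static int demo_peek(const char *path, off_t offset, void *out, size_t len)
{
    assert(path != NULL && out != NULL && len > 0);

    long page = sysconf(_SC_PAGESIZE);
    if (page <= 0 || offset < 0)
        return -1;

    /* mmap() offsets must be page aligned, so round down and skip ahead. */
    off_t base = offset - (offset % page);
    size_t delta = (size_t)(offset - base);

    int fd = open(path, O_RDONLY);
    if (fd < 0)
        return -1;

    void *map = mmap(NULL, len + delta, PROT_READ, MAP_SHARED, fd, base);
    close(fd); /* the mapping stays valid after the descriptor is closed */
    if (map == MAP_FAILED)
        return -1;

    memcpy(out, (const unsigned char *)map + delta, len);
    munmap(map, len + delta);
    return 0;
}
```

A checker built on this would read the superblock and inode table into ordinary buffers and validate them there; writing fixes back ("poke") would need a PROT_WRITE mapping and is deliberately left out of the sketch.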

What about quotas?  You have a fixed amount of persistent space, and
presumably a number of apps that the user installs on their device that
may like to use pramfs to store data.  Do you need some kernel enforcement
to stop one rogue application from using up all the space? Or do you expect that
this would be handled in some library level interface that applications will
use to access pramfs?

-Tony


Re: [PATCH 01/17] pramfs: documentation

2011-01-07 Thread Marco Stornelli
On 07/01/2011 19:42, Tony Luck wrote:
 On Thu, Jan 6, 2011 at 4:01 AM, Marco Stornelli
 marco.storne...@gmail.com wrote:
 +accessed data that must survive system reboots and power cycles. An
 +example usage might be system logs under /var/log, or a user address
 +book in a cell phone or PDA.
 
 Some usage model questions:
 
 How do you handle errors?  I see that there are a few sanity checks in the
 mount path ... but there would seem to be several opportunities for the
 file system to get corrupted in other ways.  Since you don't have a block
 device, a standard fsck program looks challenging (though I guess you
 could mmap(/dev/mem) to peek & poke at the filesystem before trying
 to mount it).

Actually no (at least when the strict devmem option is turned on), because
the memory region is currently marked exclusive (this is only a design
constraint). About the errors: pramfs does not keep file data in the
page cache for normal file I/O, so there is no writeback; read/write
operations are done with direct I/O and are always synchronous. The data are
write protected in hw when the arch provides this facility (x86 does).
Each inode contains a checksum, and inodes with problems are marked as
bad. The superblock contains a checksum and there is a redundant superblock.
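
As a rough illustration of how a checksummed superblock with a redundant copy can be validated and repaired, here is a minimal sketch. Everything in it is an assumption made for the example: the struct layout, the DEMO_MAGIC value, the function names, and the use of Adler-32 as the checksum are hypothetical and not taken from the pramfs code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define DEMO_MAGIC 0x50524d46u /* hypothetical magic number */

/* Hypothetical superblock layout; the real on-media format may differ. */
struct demo_super {
    uint32_t checksum; /* covers every byte after this field */
    uint32_t magic;
    uint64_t size;
};

/* Adler-32, used here only as a stand-in checksum. */
static uint32_t demo_sum(const void *buf, size_t len)
{
    const uint8_t *p = buf;
    uint32_t a = 1, b = 0;

    for (size_t i = 0; i < len; i++) {
        a = (a + p[i]) % 65521u;
        b = (b + a) % 65521u;
    }
    return (b << 16) | a;
}

static uint32_t demo_super_sum(const struct demo_super *sb)
{
    return demo_sum((const uint8_t *)sb + sizeof(sb->checksum),
                    sizeof(*sb) - sizeof(sb->checksum));
}

/* Recompute the stored checksum after the superblock has been modified. */
static void demo_super_seal(struct demo_super *sb)
{
    sb->checksum = demo_super_sum(sb);
}

static int demo_super_ok(const struct demo_super *sb)
{
    return sb->magic == DEMO_MAGIC && sb->checksum == demo_super_sum(sb);
}

/*
 * Returns 0 if the primary copy is valid, 1 if it had to be restored from
 * the redundant copy, and -1 if both copies are bad.
 */
static int demo_super_recover(struct demo_super *primary,
                              struct demo_super *backup)
{
    assert(primary != NULL && backup != NULL);

    if (demo_super_ok(primary)) {
        if (!demo_super_ok(backup))
            *backup = *primary; /* refresh a damaged redundant copy */
        return 0;
    }
    if (demo_super_ok(backup)) {
        *primary = *backup;
        return 1;
    }
    return -1;
}
```

In this sketch a mount path would call demo_super_recover() first and refuse to mount on -1. The same checksum-then-verify pattern would apply per inode, with a failing inode flagged as bad instead of being restored.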

 Some sort of recovery path would seem useful for the address
 book use model ... or do you just expect users to back their address book
 up (to the cloud?) and have the phone just make a clean filesystem if any
 errors are found?

Yeah, maybe the address book is not a perfectly suitable case, but it
was only an example. I thought of the fs as a cache in this use
case. However, the designer can use this area however he wants;
recently I saw a project where this fs was used as a system cache for
decrypted files, with the files stored encrypted in flash, so I think it's
flexible.

 What about quotas?  You have a fixed amount of persistent space, and
 presumably a number of apps that the user installs on their device that
 may like to use pramfs to store data.  Do you need some kernel enforcement
 to stop one rogue application from using up all the space? Or do you expect that
 this would be handled in some library level interface that applications will
 use to access pramfs?

Honestly, in my embedded systems I've never used quotas, partly to save
footprint (I mean the kernel support). I don't think it's a hot
feature in this case, and other filesystems for embedded use, such as
ubifs, jffs2, etc., don't support it.

Marco


Re: [PATCH 01/17] pramfs: documentation

2011-01-07 Thread Tony Luck
On Fri, Jan 7, 2011 at 12:30 PM, Marco Stornelli
marco.storne...@gmail.com wrote:
 constraint). About the errors: pramfs does not maintain file data in the
 page caches for normal file I/O, so no writeback, the read/write
 operation are done with direct io and they are always sync. The data are
 write protected in hw when the arch provide this facility (x86 does).
 Inode contains a checksum and when there are problems they are marked as
 bad. Superblock contains checksum and there is a redundant superblock.

But you can still get pramfs inconsistencies if the system crashes at an
inopportune moment. E.g. when making files you write the new inode to
pramfs, and then you insert the entry into the directory. A crash between
these two operations leaves an allocated inode that doesn't appear in
any directory.  Without a fsck option, it will be hard to see that you have
this problem, and your only recovery option is to wipe *all* files by making
a new filesystem.
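
The orphaned-inode case described above is what a minimal fsck-style pass (run at mount time or from a user-space tool) would look for. The sketch below assumes a deliberately simplified model: a flat inode table with an "allocated" flag, a flat list of directory entries, and inode 0 as the root. All names are hypothetical and none of this reflects the real pramfs layout.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define DEMO_MAX_INODES 64

struct demo_inode {
    bool allocated;
};

struct demo_dirent {
    unsigned ino; /* inode number this directory entry points at */
};

/*
 * Free every allocated inode that no directory entry references and
 * return how many were reclaimed. Inode 0 is assumed to be the root
 * directory and is always treated as referenced.
 */
static size_t demo_reclaim_orphans(struct demo_inode *inodes, size_t n_inodes,
                                   const struct demo_dirent *dirents,
                                   size_t n_dirents)
{
    bool referenced[DEMO_MAX_INODES] = { false };
    size_t reclaimed = 0;

    assert(inodes != NULL && n_inodes <= DEMO_MAX_INODES);
    referenced[0] = true;

    /* Pass 1: record every inode reachable from a directory entry. */
    for (size_t i = 0; i < n_dirents; i++) {
        if (dirents[i].ino < n_inodes)
            referenced[dirents[i].ino] = true;
    }

    /* Pass 2: an allocated but unreferenced inode is an orphan. */
    for (size_t i = 0; i < n_inodes; i++) {
        if (inodes[i].allocated && !referenced[i]) {
            inodes[i].allocated = false;
            reclaimed++;
        }
    }
    return reclaimed;
}
```

A non-zero return value is the point where a mount-time implementation could print a warning. A real checker would also have to release the data blocks owned by each orphan, which this sketch does not model.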

-Tony


[PATCH 01/17] pramfs: documentation

2011-01-06 Thread Marco Stornelli
From: Marco Stornelli marco.storne...@gmail.com

Documentation for PRAMFS.

Signed-off-by: Marco Stornelli marco.storne...@gmail.com
---
diff --git a/Documentation/filesystems/pramfs.txt b/Documentation/filesystems/pramfs.txt
new file mode 100644
index 000..2ad536f
--- /dev/null
+++ b/Documentation/filesystems/pramfs.txt
@@ -0,0 +1,179 @@
+
+PRAMFS Overview
+===============
+
+Many embedded systems have a block of non-volatile RAM separate from
+normal system memory, i.e. of which the kernel maintains no memory page
+descriptors. For such systems it would be beneficial to mount a
+fast read/write filesystem over this I/O memory, for storing frequently
+accessed data that must survive system reboots and power cycles. An
+example usage might be system logs under /var/log, or a user address
+book in a cell phone or PDA.
+
+Linux traditionally had no support for a persistent, non-volatile RAM-based
+filesystem, persistent meaning the filesystem survives a system reboot
+or power cycle intact. The RAM-based filesystems such as tmpfs and ramfs
+have no actual backing store but exist entirely in the page and buffer
+caches, hence the filesystem disappears after a system reboot or
+power cycle.
+
+A relatively straightforward solution is to write a simple block driver
+for the non-volatile RAM, and mount over it any disk-based filesystem such
+as ext2, ext3, ext4, etc.
+
+But the disk-based fs over non-volatile RAM block driver approach has
+some drawbacks:
+
+1. Complexity of disk-based fs: disk-based filesystems such as ext2/ext3/ext4
+   were designed for optimum performance on spinning disk media, so they
+   implement features such as block groups, which attempt to group inode data
+   into a contiguous set of data blocks to minimize disk seeking when accessing
+   files. For RAM there is no such concern; a file's data blocks can be
+   scattered throughout the media with no access speed penalty at all. So block
+   groups in a filesystem mounted over RAM just add unnecessary
+   complexity. A better approach is to use a filesystem specifically
+   tailored to RAM media which does away with these disk-based features.
+   This increases the efficient use of space on the media, i.e. more
+   space is dedicated to actual file data storage and less to meta-data
+   needed to maintain that file data.
+
+2. Different problems between disks and RAM: Because PRAMFS attempts to avoid
+   filesystem corruption caused by kernel bugs, dirty pages in the page cache
+   are not allowed to be written back to the backing-store RAM. This way, an
+   errant write into the page cache will not get written back to the filesystem.
+   However, if the backing-store RAM is comparable in access speed to system
+   memory, the penalty of not using caching is minimal. With this consideration
+   it's better to move file data directly between the user buffers and the backing
+   store RAM, i.e. use direct I/O. This prevents the unnecessary populating of
+   the page cache with dirty pages. However direct I/O has to be enabled at
+   every file open. To enable direct I/O at all times for all regular files
+   requires either that applications be modified to include the O_DIRECT flag on
+   all file opens, or that the filesystem used performs direct I/O by default.
+
+The Persistent/Protected RAM Special Filesystem (PRAMFS) is a read/write
+filesystem that has been designed to address these issues. PRAMFS is targeted
+to fast I/O memory, and if the memory is non-volatile, the filesystem will be
+persistent.
+
+In PRAMFS, direct I/O is enabled across all files in the filesystem, in other
+words the O_DIRECT flag is forced on every open of a PRAMFS file. Also, file
+I/O in the PRAMFS is always synchronous. There is no need to block the current
+process while the transfer to/from the PRAMFS is in progress, since one of
+the requirements of the PRAMFS is that the filesystem exists in fast RAM. So
+file I/O in PRAMFS is always direct, synchronous, and never blocks.
+
+The data organization in PRAMFS can be thought of as an extremely simplified
+version of ext2, such that the ratio of data to meta-data is very high.
+
+PRAMFS supports execute-in-place (XIP). With XIP, instead of keeping data in the
+page cache, the need to have a page cache copy is eliminated completely.
+Read/write type operations are performed directly from/to the memory. For file
+mappings, the RAM itself is mapped directly into userspace. XIP, in addition,
+speeds up application start-up time because it removes the need for any
+copies.
+
+PRAMFS is write protected. The page table entries that map the backing-store
+RAM are normally marked read-only. Write operations into the filesystem
+temporarily mark the affected pages as writeable, the write operation is
+carried out with locks held, and then the page table entries are
+marked read-only again.
+This feature provides protection against filesystem corruption caused by errant
+writes into the RAM due to kernel