Re: [RFD] FAT robustness

2005-07-21 Thread OGAWA Hirofumi
Hiroyuki Machida <[EMAIL PROTECTED]> writes:

>>> - Utilize noop elevator to cancel unexpected operation reordering
>> Why don't you use the barrier?
>
> You mean that using requests with barrier flag is enough and there is
> no reason to specify IO-sched ?
>
> It is better to preserve the order of data updates in some
> circumstances, such as appending data.
>
> Xvfat for 2.4 had its own elevator function, to preserve EraseBlock unit
> ordering for memory card devices.
>
> To begin consideration for 2.6, I'd like to keep it simple. But later
> we need to address this issue. So I thought of using "noop" at first,
> and later switching to a special elevator function to handle the device
> better.

Um.. So independent updates are merged in the special elevator, yes?

If so, the fs layer can do that merging just as well, and then the normal
elevator can still optimize seeks for HDD, no?  If that's possible, I'd
rather not depend on a special elevator.
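
For reference, which elevator a device currently uses can be read from
sysfs. The sketch below assumes the usual 2.6 layout, where
/sys/block/<dev>/queue/scheduler lists the available elevators and puts
the active one in square brackets. The helper names are made up for
illustration.

```python
from pathlib import Path
from typing import Optional


def parse_active_scheduler(text: str) -> Optional[str]:
    """Return the elevator name shown in [brackets], or None if none is marked."""
    for token in text.split():
        if token.startswith("[") and token.endswith("]"):
            return token[1:-1]
    return None


def active_scheduler(device: str) -> Optional[str]:
    """Read the scheduler file for a block device such as 'sda' (path assumed)."""
    path = Path("/sys/block") / device / "queue" / "scheduler"
    return parse_active_scheduler(path.read_text())


# A line in the format the kernel reports; only the bracketed entry is active.
sample = "noop anticipatory deadline [cfq]"
print(parse_active_scheduler(sample))  # cfq
```

Writing an elevator name such as "noop" to the same file (as root) selects
it for that device.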

>>>- With O_SYNC, close() flushes all related data and
>>>  meta-data, then waits for completion of I/O
>> What does this mean? Why does O_SYNC only flush at close()?
>
> From the application's point of view, an application wants to believe
> that a close()ed file is correctly written, without any corruption.
>
> At least close() needs to guarantee this. It's OK for every write() to
> flush meta-data and data and wait for completion of I/O.
>
> At least for fat on 2.4.20, VFS syncs the inode on write() with O_SYNC,
> however it doesn't take care of the super block. And the FAT side doesn't
> care about O_SYNC. That's a problem.

Ah, support for O_SYNC was added in 2.6.12.  Yes, it should apply to
every write().  If an application needs it only at close(), it can use
fsync().
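
From the application side the two choices look roughly like this. This is
only a sketch using the POSIX calls through Python's os module; the
function names are made up, and what actually reaches the media still
depends on the filesystem and the device.

```python
import os


def write_sync_each(path, chunks):
    """O_SYNC: every write() is asked to complete its I/O before returning."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC | os.O_SYNC, 0o644)
    try:
        for chunk in chunks:
            os.write(fd, chunk)
    finally:
        os.close(fd)


def write_sync_at_close(path, chunks):
    """No O_SYNC: buffered writes, then one fsync() just before close()."""
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, 0o644)
    try:
        for chunk in chunks:
            os.write(fd, chunk)
        os.fsync(fd)  # flush this file's data and metadata once
    finally:
        os.close(fd)
```

The second form is what an application that only cares about the state at
close() would use.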

Hiroyuki Machida <[EMAIL PROTECTED]> writes:

> I need to explain more background information. My descriptions tend
> to depend on some knowledge of the current xvfat for the 2.4 kernel.
>
> I'm not an author of xvfat for the 2.4 kernel, but I can explain a
> little more.
>
> Current xvfat for 2.4 is designed for a specific flash memory card
> controller which can guarantee atomicity of operations on an ERASE-BLOCK
> size unit. Xvfat for 2.4 tries to merge operations on the same
> ERASE-BLOCK under some ordering constraints.
>
> And xvfat for 2.4 uses its own version of transaction control using
> in-core memory, not a storage device like HDD or flash RAM,
> to accomplish the above goal with minimal changes to the existing
> FAT implementation. This transaction control also keeps FAT operations
> coming from different threads free from being mixed up, which could
> otherwise cause operation ordering problems.
>
> We'll start with HDD, however later we'll cover memory devices.
> For memory devices we may prepare other elevator functions,
> depending on the properties of the devices or the lower layer. E.g.
> NAND/AND flash have different operation units for read/write and erase,
> and have some translation layer.

[...]

> As other messages said, some developers suggest using "SoftUpdates".
> I need to consider the situation where memory devices are used, not HDD.

SoftUpdates itself does not depend on the block size of atomic
operations.  And it will provide the same robustness as xvfat,
although handling multiple block sizes of atomic operations will
probably be complex.

It controls dependencies at a very fine granularity.  And it will be
used by default, because performance is very good.
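
A toy illustration of what "controlling the dependency" means: each
update names the updates that must reach the disk before it, and the
write order is derived from that. The update names here are invented, and
this is not how SoftUpdates is implemented; real SoftUpdates also resolves
cyclic dependencies, while this sketch only detects them.

```python
from graphlib import CycleError, TopologicalSorter


def write_order(depends_on):
    """Return an order in which every update follows the updates it depends on."""
    return list(TopologicalSorter(depends_on).static_order())


# Appending to a file: the directory entry must not reach the disk before
# the FAT chain, and the FAT chain not before the data clusters.
deps = {
    "fat_chain": {"data_clusters"},
    "dir_entry": {"fat_chain"},
}
print(write_order(deps))  # ['data_clusters', 'fat_chain', 'dir_entry']

# A cyclic dependency has no valid order and is reported as an error here.
try:
    write_order({"a": {"b"}, "b": {"a"}})
except CycleError:
    print("cyclic dependency")
```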

This is why I think SoftUpdates is the best solution, and I'd like to
hear the details.
--
OGAWA Hirofumi <[EMAIL PROTECTED]>
-
To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
the body of a message to [EMAIL PROTECTED]
More majordomo info at  http://vger.kernel.org/majordomo-info.html
Please read the FAQ at  http://www.tux.org/lkml/


Re: [RFD] FAT robustness

2005-07-21 Thread Hiroyuki Machida

Hi,

OGAWA Hirofumi wrote:
> Hiroyuki Machida <[EMAIL PROTECTED]> writes:
>
>> We currently plan to add the following features to address FAT corruption.
>>
>> - Utilize standard 2.6 features as much as possible
>>   - Implement as options of fat, vfat and uvfat
>
> What is uvfat? A typo for xvfat?  Why is this an option (does it have
> some big drawback)?

uvfat is another variant of vfat, like umsdos.
Xvfat for 2.4 has the following directory and file organization:
most files are located in fs/xvfat, and most of them were copied from
fs/fat and fs/vfat and renamed to have a prefix like 'xvfat_'.
For 2.6, I feel that the above organization needs to be changed.
And xvfat for 2.4 had some performance degradation. So I guess 'option'
is better.

>>   - Utilize noop elevator to cancel unexpected operation reordering
>
> Why don't you use the barrier?

You mean that using requests with barrier flag is enough and there is
no reason to specify IO-sched ?

It is better to preserve the order of data updates in some
circumstances, such as appending data.

Xvfat for 2.4 had its own elevator function, to preserve EraseBlock unit
ordering for memory card devices.

To begin consideration for 2.6, I'd like to keep it simple. But later
we need to address this issue. So I thought of using "noop" at first,
and later switching to a special elevator function to handle the device
better.

>> - Coordinate order of operations so that data is updated first, meta
>>   data later, with transaction control
>
> Does this mean SoftUpdates? What does it guarantee? How does it
> handle rename(), and cyclic dependencies between updates?

In <[EMAIL PROTECTED]>, I mentioned this.

>> - With O_SYNC, close() flushes all related data and
>>   meta-data, then waits for completion of I/O
>
> What does this mean? Why does O_SYNC only flush at close()?

From the application's point of view, an application wants to believe
that a close()ed file is correctly written, without any corruption.

At least close() needs to guarantee this. It's OK for every write() to
flush meta-data and data and wait for completion of I/O.

At least for fat on 2.4.20, VFS syncs the inode on write() with O_SYNC,
however it doesn't take care of the super block. And the FAT side doesn't
care about O_SYNC. That's a problem.

> Almost everything in your email needs more detail.
> I think SoftUpdates is the best solution for now. Could you describe
> your solution in detail?

In <[EMAIL PROTECTED]>, I mentioned this.



Re: [RFD] FAT robustness

2005-07-21 Thread Hiroyuki Machida

Hi,

I need to explain more background information. My descriptions tend
to depend on some knowledge of the current xvfat for the 2.4 kernel.

I'm not an author of xvfat for the 2.4 kernel, but I can explain a
little more.

Current xvfat for 2.4 is designed for a specific flash memory card
controller which can guarantee atomicity of operations on an ERASE-BLOCK
size unit. Xvfat for 2.4 tries to merge operations on the same
ERASE-BLOCK under some ordering constraints.

And xvfat for 2.4 uses its own version of transaction control using
in-core memory, not a storage device like HDD or flash RAM,
to accomplish the above goal with minimal changes to the existing
FAT implementation. This transaction control also keeps FAT operations
coming from different threads free from being mixed up, which could
otherwise cause operation ordering problems.

We'll start with HDD, however later we'll cover memory devices.
For memory devices we may prepare other elevator functions,
depending on the properties of the devices or the lower layer. E.g.
NAND/AND flash have different operation units for read/write and erase,
and have some translation layer.
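
To make "merge operations on the same ERASE-BLOCK" concrete, here is a
small sketch that groups pending sector writes by the erase block they
fall into. The 512-byte sector and 128KB erase block are assumed example
values, and the function is illustrative only; it is not taken from the
xvfat code.

```python
SECTOR_SIZE = 512               # bytes, assumed
ERASE_BLOCK_SIZE = 128 * 1024   # bytes, assumed example value


def group_by_erase_block(sectors, sector_size=SECTOR_SIZE,
                         erase_block_size=ERASE_BLOCK_SIZE):
    """Map erase-block index -> sector numbers that fall inside that block.

    Blocks appear in the order their first sector was queued, and sectors
    keep their queue order within each block.
    """
    sectors_per_block = erase_block_size // sector_size
    groups = {}
    for sector in sectors:
        groups.setdefault(sector // sectors_per_block, []).append(sector)
    return groups


pending = [0, 300, 5, 256, 1000]
print(group_by_erase_block(pending))  # {0: [0, 5], 1: [300, 256], 3: [1000]}
```

Each group could then be issued as one operation on a controller that
treats an erase block atomically.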



Paulo Marques wrote:
> Hiroyuki Machida wrote:
>> [...]
>>  Q3 : I'm not sure JBD can be used for FAT improvements.   Do you
>> have any comments ?
>
> I might not be the best person to answer this, but this just seems so
> obvious:

Any comments are welcome.

> If you plan to let a recently hot-unplugged device to be used in another
> OS that doesn't understand your journaling extensions, your disk will be
> corrupted.
>
> If this is supposed to work only on OS's that understand your journaling
> extensions, then there are much better filesystems out there with
> journaling already.

I agree. This situation will occur even with non-removable media.
Suppose a device like an audio player which acts as a USB client and
provides a USB Mass Storage class target. The embedded storage may then
be handled by the USB host side, like a Win PC or Mac.

> You might be able to reduce the size of the time window where hot
> removing the media will cause problems, like writing all the data first
> and update the metadata in as few operations as possible. But that just
> reduces the probability of data corruption. It doesn't eliminate it at all.

As other messages said, some developers suggest using "SoftUpdates".
I need to consider the situation where memory devices are used, not HDD.



Re: [RFD] FAT robustness

2005-07-20 Thread Pavel Machek
Hi!

> >[...]
> > Q3 : I'm not sure JBD can be used for FAT improvements.   Do you 
> >have any comments ?
> 
> I might not be the best person to answer this, but this just seems so 
> obvious:
> 
> If you plan to let a recently hot-unplugged device to be used in another 
> OS that doesn't understand your journaling extensions, your disk will be 
> corrupted.

It will only be corrupted if you unplug it without unmounting, and it
will only be corrupted as much as a non-journalling disk is. Plus, you
might intentionally damage the superblock signature on mount (and fix it
on clean umount) so that you force the user to plug it back into the
journalling system.
Pavel
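
A rough sketch of that signature trick on an in-memory copy of the first
sector. It assumes the signature in question is the 0x55 0xAA boot sector
signature at offset 510; the "in use" value and the function names are
arbitrary choices for illustration.

```python
SIGNATURE_OFFSET = 510
VALID_SIGNATURE = b"\x55\xaa"
IN_USE_SIGNATURE = b"\x00\x00"  # arbitrary marker, assumed for this sketch


def looks_mountable(sector):
    """True if the sector carries the normal boot sector signature."""
    return bytes(sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2]) == VALID_SIGNATURE


def mark_in_use(sector):
    """On mount: damage the signature so other systems refuse the volume."""
    if not looks_mountable(sector):
        raise ValueError("volume was not cleanly unmounted")
    sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] = IN_USE_SIGNATURE


def mark_clean(sector):
    """On clean umount: restore the signature."""
    sector[SIGNATURE_OFFSET:SIGNATURE_OFFSET + 2] = VALID_SIGNATURE


sector = bytearray(512)
mark_clean(sector)
mark_in_use(sector)
print(looks_mountable(sector))  # False until a clean umount restores it
```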

-- 
teflon -- maybe it is a trademark, but it should not be.


Re: [RFD] FAT robustness

2005-07-20 Thread Denis Vlasenko
On Tuesday 19 July 2005 19:58, Etienne Lorrain wrote:
> > I'd like to have a discussion about FAT robustness.
> > Please give your thought, comments and related issues.
> 
>   What I would like is to treat writing to FAT (writing to a removable
>  drive), which needs a complete "mount", completely differently from
>  just quickly reading a file (a standard use of removable devices).
> 
>  Basically, to read you would not need to mount the partition, just
>  read /readfs/fd1, which uses two or three functions accessing /dev/fd1
>  in raw mode to read the filesystem descriptor and the root directory.
>  Same for /readfs/cdrom and /readfs/sda4 (USB drive).
>  The only cache would be the one provided by /dev/fd1 - a kind of
>  read-only mount at each file opening.
> 
>  This system would be disabled if the partition is already mounted
>  read/write somewhere - but as long as you do not try to write to
>  a removable disk you can extract it at any time.
> 
>   The two or three functions I am talking about are located in Gujin's
>  "fs.c" file, which accesses FAT12/16/32, EXT2/3 and ISOFS read-only
>  ( http://gujin.org ). Just a few kilobytes - plus some source
>  modifications for that use.

I think we will be better off with a more generic 'flush all dirty data
and mark superblock as clean asap' behaviour, aka 'weak O_SYNC',
so that we can remove e.g. USB removable media almost anytime (can't
safely remove it _only while it is being written to_).
--
vda



Re: [RFD] FAT robustness

2005-07-19 Thread Horst von Brand
Etienne Lorrain <[EMAIL PROTECTED]> wrote:
> > I'd like to have a discussion about FAT robustness.
> > Please give your thought, comments and related issues.

>   What I would like is to treat writing to FAT (writing to a removable
>  drive), which needs a complete "mount", completely differently from
>  just quickly reading a file (a standard use of removable devices).

Sounds like a job for mtools(1).
-- 
Dr. Horst H. von Brand   User #22616 counter.li.org
Departamento de Informatica Fono: +56 32 654431
Universidad Tecnica Federico Santa Maria  +56 32 654239
Casilla 110-V, Valparaiso, ChileFax:  +56 32 797513


[RFD] FAT robustness

2005-07-19 Thread Etienne Lorrain
> I'd like to have a discussion about FAT robustness.
> Please give your thought, comments and related issues.

  What I would like is to treat writing to FAT (writing to a removable
 drive), which needs a complete "mount", completely differently from
 just quickly reading a file (a standard use of removable devices).

 Basically, to read you would not need to mount the partition, just
 read /readfs/fd1, which uses two or three functions accessing /dev/fd1
 in raw mode to read the filesystem descriptor and the root directory.
 Same for /readfs/cdrom and /readfs/sda4 (USB drive).
 The only cache would be the one provided by /dev/fd1 - a kind of
 read-only mount at each file opening.

 This system would be disabled if the partition is already mounted
 read/write somewhere - but as long as you do not try to write to
 a removable disk you can extract it at any time.

  The two or three functions I am talking about are located in Gujin's
 "fs.c" file, which accesses FAT12/16/32, EXT2/3 and ISOFS read-only
 ( http://gujin.org ). Just a few kilobytes - plus some source
 modifications for that use.
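
 To give an idea of how little is needed for the read-only case, here
 is a sketch that decodes the filesystem descriptor (the BIOS parameter
 block) from a raw FAT12/16 boot sector and locates the root directory.
 It is not the Gujin code; the function name and the returned dictionary
 are made up, and FAT32 (which keeps its root directory in a cluster
 chain) is not handled.

```python
import struct

# BPB fields at byte offset 11 of the boot sector, little-endian:
# bytes/sector, sectors/cluster, reserved sectors, number of FATs,
# root entries, total sectors (16-bit), media byte, sectors per FAT.
BPB_FORMAT = "<HBHBHHBH"
BPB_OFFSET = 11
DIR_ENTRY_SIZE = 32


def parse_bpb(sector):
    """Decode the FAT12/16 BPB and compute where the root directory lives."""
    if len(sector) < 512 or bytes(sector[510:512]) != b"\x55\xaa":
        raise ValueError("missing boot sector signature")
    (bytes_per_sector, sectors_per_cluster, reserved, num_fats,
     root_entries, total_sectors, media, fat_size) = struct.unpack_from(
        BPB_FORMAT, sector, BPB_OFFSET)
    root_bytes = root_entries * DIR_ENTRY_SIZE
    return {
        "bytes_per_sector": bytes_per_sector,
        "sectors_per_cluster": sectors_per_cluster,
        "root_dir_sector": reserved + num_fats * fat_size,
        "root_dir_sectors": (root_bytes + bytes_per_sector - 1) // bytes_per_sector,
        "total_sectors": total_sectors,
        "media": media,
    }
```

 With the first 512 bytes read from a device node or an image file,
 parse_bpb() tells where to read the root directory from.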

  Etienne.








Re: [RFD] FAT robustness

2005-07-19 Thread OGAWA Hirofumi
Hiroyuki Machida <[EMAIL PROTECTED]> writes:

> We currently plan to add the following features to address FAT corruption.
>
> - Utilize standard 2.6 features as much as possible
>   - Implement as options of fat, vfat and uvfat

What is uvfat? A typo for xvfat?  Why is this an option (does it have
some big drawback)?

>   - Utilize noop elevator to cancel unexpected operation reordering

Why don't you use the barrier?

> - Coordinate order of operations so that data is updated first, meta
>   data later, with transaction control

Does this mean SoftUpdates? What does it guarantee? How does it
handle rename(), and cyclic dependencies between updates?

> - With O_SYNC, close() flushes all related data and
>   meta-data, then waits for completion of I/O

What does this mean? Why does O_SYNC only flush at close()?

Almost everything in your email needs more detail.
I think SoftUpdates is the best solution for now. Could you describe
your solution in detail?
-- 
OGAWA Hirofumi <[EMAIL PROTECTED]>


Re: [RFD] FAT robustness

2005-07-18 Thread Paulo Marques

Hiroyuki Machida wrote:
> [...]
>  Q3 : I'm not sure JBD can be used for FAT improvements.   Do you
> have any comments ?

I might not be the best person to answer this, but this just seems so
obvious:

If you plan to let a recently hot-unplugged device be used in another
OS that doesn't understand your journaling extensions, your disk will be
corrupted.

If this is supposed to work only on OSes that understand your journaling
extensions, then there are much better filesystems out there with
journaling already.

You might be able to reduce the size of the time window where hot
removing the media will cause problems, for example by writing all the
data first and updating the metadata in as few operations as possible.
But that just reduces the probability of data corruption. It doesn't
eliminate it at all.


--
Paulo Marques - www.grupopie.com

It is a mistake to think you can solve any major problems
just with potatoes.
Douglas Adams


[RFD] FAT robustness

2005-07-16 Thread Hiroyuki Machida


Folks,

I'd like to have a discussion about FAT robustness.
Please give your thoughts, comments and related issues.

A few years ago, we added some features to FAT, called xvfat,
so that the system and FAT are robust against unexpected media hot
unplug, and so that applications can be made correctly aware of the event.

Just for your reference, I put a patch against the 2.4.20 kernel at
http://www.celinuxforum.org/CelfPubWiki/XvFatDiscussion?action=AttachFile&do=get&target=20050715-xvfat-2.4.20.patch
This includes the following features:

- Handle media removed during “mount”
- Notification of media removal to application
- Cancellation of I/O Elevator for Block device
- Block system calls until completion of writing
- Control order of meta-data updates, using transaction
  control implemented in fs/xvfat/fwrq.c
- File syscalls return “error”, except umount
- Japanese file name support
  - possible 1-N mapping issues SJIS <-> UNICODE
- Dirty Flag support
- TIME ZONE support

On moving to 2.6, we are considering and categorizing the issues again.
And we are planning to have an open source project to add these features
to the 2.6 kernel.  I'd like to open discussion about these features
and how to implement them on the 2.6 kernel.

1. Issues to be addressed

- Issues around FAT with CE devices
  - Hot unplug issues
    - File system corruption on unplug of media/storage device
        Almost the same as power down without umount

    - Notification of the event
        Applications need to know about the event precisely
        Needs more investigation

    - System stability after unplug
        Almost the same as the I/O error recovery issues discussed
        on LKML
        http://developer.osdl.jp/projects/doubt/fs-consistency-and-coherency/index.html
        http://groups.google.co.jp/group/linux.kernel/browse_thread/thread/b9c11bccd59e0513/4a4dd84b411c6d32?q=[RFD]+FS+behavior+(I%2FO+failure)+in+kernel+summit++lkml&rnum=1&hl=ja#4a4dd84b411c6d32

  - Other issues
    - Time stamp issues
        always uses local time
        time resolution is in 2-second units

    - Issues around mapping between UNICODE and local char code
        1-N mapping SJIS <-> UNICODE
        Potential directory cache problem due to 1-N mapping
        Possible inconsistency problems on the application side

    - Support file size over 2GB

    - Support dirty flag
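
On the time stamp item above: the 2-second resolution comes from the
on-disk format, where the 16-bit time field has 5 bits for hours, 6 bits
for minutes and only 5 bits for seconds, so seconds are stored divided by
two. A small sketch of the encoding (the function names are made up):

```python
def encode_fat_time(hours, minutes, seconds):
    """Pack a time of day into the 16-bit FAT directory entry time field."""
    return (hours << 11) | (minutes << 5) | (seconds // 2)


def decode_fat_time(value):
    """Unpack the 16-bit FAT time field into (hours, minutes, seconds)."""
    return (value >> 11) & 0x1F, (value >> 5) & 0x3F, (value & 0x1F) * 2


# An odd second cannot be represented and is rounded down on disk.
print(decode_fat_time(encode_fat_time(13, 37, 59)))  # (13, 37, 58)
```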

 Q1 : First issue for discussion is "Do you have any other issues
about this?" and "Do you have any other idea to categorize
the issues?"


2.  FAT corruption on unplug of media/storage device

On starting the open source project, we will focus on the following issue
first.
- File system corruption on unplug of media/storage device
    Almost the same as power down without umount

And we are planning to focus on HDD devices and treat system power down
instead of media unplug, because
 A. Damage and its countermeasures may depend on properties of the lower
    layer
    E.g.
      - Memory Card
          Some controllers can guarantee atomicity of certain
          operations
      - Flash Memory (NAND, NOR)
          I/O operations may be constrained by Block Size
          (e.g. 128KB) or Page Size (e.g. 2KB)
      - HDD
          - Cache memory may reside inside the drive
          - A sector which is being written
            on power down may be corrupted (can't be read anymore)


 B.  It may make the problem easier
      - Sector size is 512 bytes
      - Many developers can check with a PC

 Q2 : Do you know any other storage devices and their properties, to
      be addressed later?


3. Features to be developed for FAT corruption.

We currently plan to add the following features to address FAT corruption.

   - Utilize standard 2.6 features as much as possible
       - Implement as options of fat, vfat and uvfat
       - Utilize existing journal block device (JBD) for transaction control
       - Utilize noop elevator to cancel unexpected operation
         reordering
   - Coordinate order of operations so that data is updated first, meta
     data later, with transaction control
   - With O_SYNC, close() flushes all related data and
     meta-data, then waits for completion of I/O
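
To illustrate the "data first, meta data later" item, here is a toy model
of appending one cluster to a file. The on-disk state is just three
dictionaries and a crash is modelled as applying only a prefix of the
steps, so this is a simplified sketch with invented names, not the
planned implementation.

```python
def plan_append(cluster, payload):
    """Ordered steps for appending one cluster: data first, meta-data after."""
    return [
        ("data", cluster, payload),     # 1. cluster contents
        ("fat", cluster, "EOC"),        # 2. FAT entry (end of chain)
        ("dir", "size", len(payload)),  # 3. directory entry: file size
    ]


def apply_prefix(steps, count):
    """Model a crash: only the first `count` steps reached the disk."""
    disk = {"data": {}, "fat": {}, "dir": {}}
    for kind, key, value in steps[:count]:
        disk[kind][key] = value
    return disk


def metadata_is_safe(disk):
    """True if every cluster the FAT refers to already has its data written."""
    return all(cluster in disk["data"] for cluster in disk["fat"])


steps = plan_append(7, b"hello")
print(all(metadata_is_safe(apply_prefix(steps, n)) for n in range(4)))  # True
```

With the steps in this order, a crash at any point leaves meta-data that
never refers to unwritten data; with the order reversed, the check fails
as soon as the FAT entry is written before the cluster.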


 Q3 : I'm not sure JBD can be used for FAT improvements. 
  Do you have any comments ?



Thanks,
Hiroyuki Machida


