Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-09-06 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+--
 Reporter:  Sebastian Huber  |   Owner:  Sebastian Huber
 Type:  defect   |  Status:  closed
 Priority:  normal   |   Milestone:  4.11.3
Component:  filesystem   | Version:  4.11
 Severity:  normal   |  Resolution:  fixed
 Keywords:   |
-+--
Changes (by Sebastian Huber ):

 * status:  assigned => closed
 * resolution:   => fixed


Comment:

 In [changeset:"a3199d91f395c35accffcf1b037bc398a5a41205/rtems"
 a3199d91/rtems]:
 {{{
 #!CommitTicketReference repository="rtems"
 revision="a3199d91f395c35accffcf1b037bc398a5a41205"
 dosfs: Fix fat_file_update()

 Do not update the non-existent meta-data of the root directory.

 Close #2944.
 }}}

--
Ticket URL: 
RTEMS Project 
RTEMS Project
___
bugs mailing list
bugs@rtems.org
http://lists.rtems.org/mailman/listinfo/bugs

Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-09-06 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+--
 Reporter:  Sebastian Huber  |   Owner:  Sebastian Huber
 Type:  defect   |  Status:  assigned
 Priority:  normal   |   Milestone:  4.11.3
Component:  filesystem   | Version:  4.11
 Severity:  normal   |  Resolution:
 Keywords:   |
-+--

Comment (by Sebastian Huber ):

 In [changeset:"4d495cfd014687fa68b17643407f3c64eb890b4f/rtems"
 4d495cf/rtems]:
 {{{
 #!CommitTicketReference repository="rtems"
 revision="4d495cfd014687fa68b17643407f3c64eb890b4f"
 dosfs: Fix fat_file_update()

 Do not update the non-existent meta-data of the root directory.

 Update #2944.
 }}}


Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-09-06 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+--
 Reporter:  Sebastian Huber  |   Owner:  Sebastian Huber
 Type:  defect   |  Status:  assigned
 Priority:  normal   |   Milestone:  4.11.3
Component:  filesystem   | Version:  4.11
 Severity:  normal   |  Resolution:
 Keywords:   |
-+--
Changes (by Sebastian Huber):

 * version:  4.12 => 4.11
 * milestone:  4.12.0 => 4.11.3



Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-08-13 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+---
 Reporter:  Sebastian Huber  |   Owner:  chrisj@…
 Type:  defect   |  Status:  new
 Priority:  normal   |   Milestone:  4.12.0
Component:  filesystem   | Version:  4.12
 Severity:  normal   |  Resolution:
 Keywords:   |
-+---
Changes (by Chris Johns):

 * version:  4.11 => 4.12



Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-03-28 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+---
 Reporter:  Sebastian Huber  |   Owner:  chrisj@…
 Type:  defect   |  Status:  new
 Priority:  normal   |   Milestone:  4.12
Component:  filesystem   | Version:  4.11
 Severity:  normal   |  Resolution:
 Keywords:   |
-+---

Comment (by slemstick):

 Replying to [comment:1 Chris Johns]:
 > Replying to [ticket:2944 Sebastian Huber]:
 > > Removing the function call in msdos_shut_down ( .. ) to close the root
 > > file descriptor solves the problem perfectly (clean fsck).
 >
 > I assume you mean fat_file_close?
 Yes.
 >
 > > However, we're a bit unsure about the intent behind closing the root
 > > directory.
 >
 > There is memory allocated in fat_file_open which you would leak.

 We fixed this issue by creating a special "root file close" function that
 omits the call to fat_file_update() made in fat_file_close() (which caused
 the corruption).

 >
 > I see the fat_file_close calls fat_buf_release and if the fs_info cache
 > is not empty it is holding a bdbuf buffer so this would cause a leak of
 > buffers.
 >
 > What about the fat_file_close calls in the msdos init call on error?
 > Would they also cause the same problem?

 Yes, these will cause the same issues.

 To update / summarise this ticket a bit here:

 We originally attempted a fix to this problem by setting the hard-coded
 root directory cluster number to 2, as well as the above (remove
 corruption caused by fat_file_update() in fat_file_close() on unmount).

 However, our attempt to fix the broken root cluster numbering breaks a
 hashing mechanism in fat_file_open(..). This mechanism indexes open file
 descriptors based on 1) parent directory cluster number and 2) offset into
 that directory structure. The issue is that the root directory, and the
 file pointed to by the first directory entry in the root directory, may
 construct their hashes based on the same indexes:

 > Root directory: cluster number 2, offset 0
 > First file in root directory: cluster number 2, offset 0

 Before, this was not a problem of course, as the root directory had the
 hard-coded cluster number of 1, and the keys were therefore always unique.
 With the root directory at cluster 2, however, the keys can collide, which
 can cause a number of new issues.
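
 A minimal sketch of the collision, assuming a simplified key built only
 from the parent directory cluster number and the entry offset. The names
 demo_dir_pos, demo_key() and demo_show_collision() are hypothetical and are
 not the dosfs implementation:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical position of a directory entry: cluster number of the parent
 * directory plus the byte offset of the entry inside that directory. */
typedef struct {
  uint32_t cln;
  uint32_t ofs;
} demo_dir_pos;

/* Simplified stand-in for the key used to index open file descriptors.
 * Assumption: 32-byte directory entries, so ofs / 32 is the entry index. */
static uint64_t demo_key(demo_dir_pos pos)
{
  return ((uint64_t) pos.cln << 32) | (uint64_t) (pos.ofs / 32u);
}

static bool demo_keys_collide(demo_dir_pos a, demo_dir_pos b)
{
  return demo_key(a) == demo_key(b);
}

/* Compares the two root directory numberings discussed in this ticket. */
static void demo_show_collision(void)
{
  demo_dir_pos first_file = { 2u, 0u }; /* first entry in the root directory */
  demo_dir_pos root_at_2  = { 2u, 0u }; /* root keyed with cluster number 2 */
  demo_dir_pos root_at_1  = { 1u, 0u }; /* root keyed with cluster number 1 */

  assert(demo_keys_collide(root_at_2, first_file));  /* same key */
  assert(!demo_keys_collide(root_at_1, first_file)); /* unique keys */
}
```

 In this model, keying the root directory with cluster number 1 keeps it
 apart from every real directory entry, because no data cluster is numbered
 below 2.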

 The fix to this problem is to set the hard-coded root directory cluster
 number back to 1, rather than drastically changing the key hashing method,
 function calls and data structures, and to trust that removing the calls
 to fat_file_update() on the root node is sufficient to avoid the data
 corruption issue described above.
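
 As a rough sketch of that approach, the meta-data update can be guarded so
 that it is skipped for the root node. The struct and function names below
 (demo_file, demo_is_root_dir(), demo_file_update()) are hypothetical
 stand-ins, and the root check is an assumption based on the hard-coded
 cluster number of 1 described above:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Mirrors the hard-coded root directory cluster number discussed above. */
#define DEMO_ROOTDIR_CLUSTER_NUM 1u

/* Hypothetical, heavily reduced file descriptor. */
typedef struct {
  uint32_t parent_cln;       /* cluster holding the file's directory entry */
  uint32_t parent_ofs;       /* offset of that directory entry */
  bool     meta_data_changed;
  unsigned updates_written;  /* counts simulated directory entry writes */
} demo_file;

/* Assumption: the root node is recognised by its artificial position. */
static bool demo_is_root_dir(const demo_file *f)
{
  return f->parent_cln == DEMO_ROOTDIR_CLUSTER_NUM && f->parent_ofs == 0u;
}

/* The root directory has no parent directory entry, so there is nothing
 * to write back for it. */
static void demo_file_update(demo_file *f)
{
  assert(f != NULL);

  if (f->meta_data_changed && !demo_is_root_dir(f)) {
    /* Stand-in for writing first cluster number, file size, time and date. */
    ++f->updates_written;
  }
}
```

 With this guard, closing the root node on unmount still releases its
 resources in the caller, but no directory entry write is attempted for it.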

 However, there are two other places in msdos_misc.c where the hardcoded
 root directory cluster number - FAT_ROOTDIR_CLUSTER_NUM - is used:

 > msdos_get_name_node()
 > msdos_get_dotdot_dir_info_cluster_num()

 Like this:

 if ( (MSDOS_EXTRACT_CLUSTER_NUM(dotdot_node)) == 0)
 {
     /*
      * we handle root dir for all FAT types in the same way with the
      * ordinary directories ( through fat_file_* calls )
      */
     fat_dir_pos_init(dir_pos);
     dir_pos->sname.cln = FAT_ROOTDIR_CLUSTER_NUM;
 }

 Which, to my understanding, will never occur as you should never have a
 cluster number below 2 in a compliant (msdos) FAT file system. Does anyone
 know the intent behind this?


Re: [RTEMS Project] #2944: FAT data corruption during unmount()

2017-03-20 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+---
 Reporter:  Sebastian Huber  |   Owner:  chrisj@…
 Type:  defect   |  Status:  new
 Priority:  normal   |   Milestone:  4.12
Component:  filesystem   | Version:  4.11
 Severity:  normal   |  Resolution:
 Keywords:   |
-+---

Comment (by Chris Johns):

 Replying to [ticket:2944 Sebastian Huber]:
 > Removing the function call in msdos_shut_down ( .. ) to close the root
 > file descriptor solves the problem perfectly (clean fsck).

 I assume you mean fat_file_close?

 > However, we're a bit unsure about the intent behind closing the root
 > directory.

 There is memory allocated in fat_file_open which you would leak.

 I see the fat_file_close calls fat_buf_release and if the fs_info cache is
 not empty it is holding a bdbuf buffer so this would cause a leak of
 buffers.

 What about the fat_file_close calls in the msdos init call on error? Would
 they also cause the same problem?


[RTEMS Project] #2944: FAT data corruption during unmount()

2017-03-20 Thread RTEMS trac
#2944: FAT data corruption during unmount()
-+--
 Reporter:  Sebastian Huber  |  Owner:  chrisj@…
 Type:  defect   | Status:  new
 Priority:  normal   |  Milestone:  4.12
Component:  filesystem   |Version:  4.11
 Severity:  normal   |   Keywords:
-+--
 https://lists.rtems.org/pipermail/users/2017-March/031101.html

 In msdos_shut_down ( msdos_fsunmount.c ) there is a call to
 fat_file_close( .. ) which attempts to close a file
 descriptor and write a range of metadata to that file's directory entry
 located in another cluster:

 * fat_file_write_first_cluster_num
 * fat_file_write_file_size
 * fat_file_write_time_and_date

 The problem is that this is the root node, and of course doesn't have a
 corresponding parent directory entry.

 In addition, the "parent directory entry" cluster number is initialised to
 0x1 (FAT_ROOTDIR_CLUSTER_NUM), which does not conform to the FAT
 specification (cluster numbering starts at 2).
 This creates a critical bug that overwrites data in the cluster region.
 To calculate the sector number of the cluster, 2 is subtracted from 1,
 which, through a series of function calls, leads to a sector number at
 the end of FAT2 (just below the start of the cluster region). The driver
 believes this is a FAT region (in fat_buf_release), writes the sector to
 what it "thinks" is FAT1, and proceeds to copy the changes to FAT2 by
 adding FAT_LENGTH (8161) to the sector number. This leads to a write well
 into the cluster region, randomly overwriting files.
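
 The arithmetic can be modelled with a small sketch. Only the FAT length of
 8161 sectors comes from this report; the FAT start sector, the number of
 FAT copies, the sectors per cluster and all demo_* names are assumptions
 made for illustration and are not the dosfs code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical volume layout. */
typedef struct {
  uint32_t fat_loc;    /* first sector of FAT1 (assumed) */
  uint32_t fat_length; /* sectors per FAT (8161 in this report) */
  uint32_t fat_count;  /* number of FAT copies (assumed) */
  uint32_t spc_log2;   /* log2 of sectors per cluster (assumed) */
} demo_volume;

/* First sector of the cluster (data) region, directly after the last FAT. */
static uint32_t demo_data_first_sector(const demo_volume *v)
{
  return v->fat_loc + v->fat_count * v->fat_length;
}

/* Data clusters are numbered from 2. For cln == 1 the unsigned subtraction
 * wraps around, so the result lies one cluster below the data region. */
static uint32_t demo_cluster_to_sector(const demo_volume *v, uint32_t cln)
{
  assert(v->spc_log2 < 32u);
  return ((cln - 2u) << v->spc_log2) + demo_data_first_sector(v);
}

/* Model of the "is this a FAT sector?" decision described above. */
static bool demo_sector_in_fat_region(const demo_volume *v, uint32_t sec)
{
  return sec >= v->fat_loc && sec < demo_data_first_sector(v);
}

/* Sector that receives the copy when a presumed FAT1 sector is mirrored. */
static uint32_t demo_mirror_sector(const demo_volume *v, uint32_t sec)
{
  return sec + v->fat_length;
}
```

 With the assumed layout (FAT1 at sector 32, two FATs, 8 sectors per
 cluster), the data region starts at sector 16354. Cluster 1 then maps to
 sector 16346, which is inside FAT2, and the mirrored write goes to sector
 16346 + 8161 = 24507, which is inside the cluster region.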

 The three function calls above lead to fsck complaining about disk
 structure:

 ###

 fsck from util-linux 2.27.1
 fsck.fat 3.0.28 (2015-05-16)
 0x41: Dirty bit is set. Fs was not properly unmounted and some data may be
 corrupt.
 1) Remove dirty bit
 2) No action
 ? 2
 There are differences between boot sector and its backup.
 This is mostly harmless. Differences: (offset:original/backup)
   65:01/00
 1) Copy original to backup
 2) Copy backup to original
 3) No action
 ? 3
 /  and
 /APPLICAT.ION
   share clusters.
   Truncating second to 0 bytes because first is FAT32 root dir.
 /APPLICAT.ION
   File size is 4096 bytes, cluster chain length is 0 bytes.
   Truncating file to 0 bytes.
 Perform changes ? (y/n) n
 /dev/sdm1: 14 files, 1600/1044483 clusters

 

 In particular the "shared cluster" problem is caused by
 fat_file_write_first_cluster_num, which adds a directory
 entry to the root directory cluster pointing at itself; i.e. there is a
 directory entry in cluster 2 pointing to
 a file in cluster 2. (Note: this occurs because we have fixed the "point
 to cluster # 1 issue" by reading the relative
 location of the root cluster node from the FAT volume info structure).

 Removing the function call in msdos_shut_down ( .. ) to close the root
 file descriptor solves the problem perfectly
 (clean fsck). However, we're a bit unsure about the intent behind closing
 the root directory.
