Re: diff: improving msdosfs write speed for large files

2012-04-05 Thread Alexander Polakov
* Mike Belopuhov m...@crypt.org.ru [120404 17:51]:
 i agree that this is a great find.  i don't really like the diff though.
 i see no point in introducing this macro.  what do others think?

Your diff looks better to me.

 Index: msdosfs/denode.h
 ===
 RCS file: /cvs/src/sys/msdosfs/denode.h,v
 retrieving revision 1.23
 diff -u -p -r1.23 denode.h
 --- msdosfs/denode.h  17 Jul 2010 19:27:07 -  1.23
 +++ msdosfs/denode.h  4 Apr 2012 12:20:23 -
 @@ -116,10 +116,11 @@ struct fatcache {
   * cache is probably pretty worthless if a file is opened by multiple
   * processes.
   */
 -#define  FC_SIZE 2   /* number of entries in the cache */
 +#define  FC_SIZE 3   /* number of entries in the cache */
  #define  FC_LASTMAP  0   /* entry the last call to pcbmap() 
 resolved
* to */
  #define  FC_LASTFC   1   /* entry for the last cluster in the 
 file */
 +#define  FC_OLASTFC  2   /* entry for the previous last cluster 
 */
  
  #define  FCE_EMPTY   0x  /* doesn't represent an actual 
 cluster # */
  
 Index: msdosfs/msdosfs_fat.c
 ===
 RCS file: /cvs/src/sys/msdosfs/msdosfs_fat.c,v
 retrieving revision 1.22
 diff -u -p -r1.22 msdosfs_fat.c
 --- msdosfs/msdosfs_fat.c 4 Jul 2011 04:30:41 -   1.22
 +++ msdosfs/msdosfs_fat.c 4 Apr 2012 12:20:26 -
 @@ -952,6 +952,13 @@ extendfile(struct denode *dep, uint32_t 
   return (error);
   }
  
 + /*
 +  * Preserve value for the last cluster before extending the file
 +  * to speed up further lookups.
 +  */
 + fc_setcache(dep, FC_OLASTFC, dep-de_fc[FC_LASTFC].fc_frcn,
 + dep-de_fc[FC_LASTFC].fc_fsrcn);
 +
   while (count  0) {
   /*
* Allocate a new cluster chain and cat onto the end of the

-- 
Alexander Polakov | plhk.ru



Re: diff: improving msdosfs write speed for large files

2012-04-05 Thread Mike Belopuhov
On Thu, Apr 5, 2012 at 9:21 AM, Alexander Polakov polac...@gmail.com wrote:
 * Mike Belopuhov m...@crypt.org.ru [120404 17:51]:
 i agree that this is a great find.  i don't really like the diff though.
 i see no point in introducing this macro.  what do others think?

 Your diff looks better to me.


the diff is in. thanks for pointing this out to us!



Re: diff: improving msdosfs write speed for large files

2012-04-04 Thread Alexander Hall
Alexander Polakov polac...@gmail.com wrote:

This is a diff from NetBSD pr.34583:
http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=34583

Quoting the author:

   I noticed that when writing large file (hundreds of megabytes)
   to an msdos disk, the writing speed to a file decreases with the
   file length.
   Since I have some experience with messydos filesystems (I wrote
   MSH: for the Amiga) I took a look.
   The obvious suspicion with operations that slow down with the
   length of a file is an excessive traversal of the FAT cluster
   chain. However, there is a cache that caches 2 positions: the
   last cluster in the file, and the most recently looked up one.
   Debugging info showed however that frequent full traversals were
   still made. So, apparently when extending a file and after
   updating the end cluster, the previous end is again needed.
   Adding a 3rd entry in the cache, which keeps the end position
   from just before extending a file.
   This has the desired effect of keeping the write speed constant.
   (What it is that needs that position I have not been able to
   ascertain from the filesystem code; it doesn't seem to make
   sense, actually, to read or write clusters before the original
   EOF. I was hoping to find the place where the cache is trashed
   and rewrite it to get the desired info from it beforehand, so
   that the extra cache entry is again unneeded, but alas.)

While there, I changed 0 to NULL for two pointer arguments of
extendfile().

Index: sys/msdosfs/denode.h
===
RCS file: /cvs/src/sys/msdosfs/denode.h,v
retrieving revision 1.23
diff -u -p -u -r1.23 denode.h
--- sys/msdosfs/denode.h   17 Jul 2010 19:27:07 -  1.23
+++ sys/msdosfs/denode.h   1 Apr 2012 14:27:48 -
@@ -116,10 +116,11 @@ struct fatcache {
  * cache is probably pretty worthless if a file is opened by multiple
  * processes.
  */
-#define   FC_SIZE 2   /* number of entries in the cache */
+#define   FC_SIZE 3   /* number of entries in the cache */
 #define   FC_LASTMAP  0   /* entry the last call to pcbmap() 
 resolved
* to */
 #define   FC_LASTFC   1   /* entry for the last cluster in the 
 file */
+#define   FC_NEXTTOLASTFC 2   /* entry for a close to the last 
cluster in
the file */
 
#defineFCE_EMPTY   0x  /* doesn't represent an actual 
cluster #
*/
 
@@ -130,6 +131,12 @@ struct fatcache {
   (dep)-de_fc[slot].fc_frcn = frcn; \
   (dep)-de_fc[slot].fc_fsrcn = fsrcn;
 
+#define fc_last_to_nexttolast(dep) \
+  do {  \
+  (dep)-de_fc[FC_NEXTTOLASTFC].fc_frcn =
(dep)-de_fc[FC_LASTFC].fc_frcn; \
+  (dep)-de_fc[FC_NEXTTOLASTFC].fc_fsrcn =
(dep)-de_fc[FC_LASTFC].fc_fsrcn; \
+  } while (0)
+   
 /*
* This is the in memory variant of a dos directory entry.  It is
usually
  * contained within a vnode.
Index: sys/msdosfs/msdosfs_fat.c
===
RCS file: /cvs/src/sys/msdosfs/msdosfs_fat.c,v
retrieving revision 1.22
diff -u -p -u -r1.22 msdosfs_fat.c
--- sys/msdosfs/msdosfs_fat.c  4 Jul 2011 04:30:41 -   1.22
+++ sys/msdosfs/msdosfs_fat.c  1 Apr 2012 14:27:49 -
@@ -952,6 +952,8 @@ extendfile(struct denode *dep, uint32_t 
   return (error);
   }
 
+  fc_last_to_nexttolast(dep);
+
   while (count  0) {
   /*
* Allocate a new cluster chain and cat onto the end of the
Index: sys/msdosfs/msdosfs_lookup.c
===
RCS file: /cvs/src/sys/msdosfs/msdosfs_lookup.c,v
retrieving revision 1.24
diff -u -p -u -r1.24 msdosfs_lookup.c
--- sys/msdosfs/msdosfs_lookup.c   4 Jul 2011 04:30:41 -   1.24
+++ sys/msdosfs/msdosfs_lookup.c   1 Apr 2012 14:27:49 -
@@ -620,8 +620,9 @@ createde(struct denode *dep, struct deno
   diroffset = ddep-de_fndoffset + sizeof(struct direntry)
   - ddep-de_FileSize;
   dirclust = de_clcount(pmp, diroffset);
-  if ((error = extendfile(ddep, dirclust, 0, 0, DE_CLEAR)) != 0) {
-  (void)detrunc(ddep, ddep-de_FileSize, 0, NOCRED, NULL);
+  error = extendfile(ddep, dirclust, NULL, NULL, DE_CLEAR);
+  if (error) {
+  (void) detrunc(ddep, ddep-de_FileSize, 0, NOCRED, 
NULL);
   return error;
   }

tests:

w/o the patch:
 time cp huge.file /mnt/storage/
   4m5.87s real 0m0.04s user 0m17.56s system

w/the patch:
 time cp huge.file /mnt/storage/
   2m22.48s real 0m0.02s user 0m45.30s system

-- 
Alexander Polakov | plhk.ru

Just curious; how huge is that file?

/Alexander



Re: diff: improving msdosfs write speed for large files

2012-04-04 Thread Alexander Polakov
* Alexander Hall alexan...@beard.se [120404 16:16]:
 Alexander Polakov polac...@gmail.com wrote:
 tests:
 
 w/o the patch:
  time cp huge.file /mnt/storage/
4m5.87s real 0m0.04s user 0m17.56s system
 
 w/the patch:
  time cp huge.file /mnt/storage/
2m22.48s real 0m0.02s user 0m45.30s system
 
 -- 
 Alexander Polakov | plhk.ru
 
 Just curious; how huge is that file?

$ ls -lah huge.file
 -rw-r--r--  1 estet  users   500M Apr  4 08:41 huge.file

-- 
Alexander Polakov | plhk.ru



Re: diff: improving msdosfs write speed for large files

2012-04-04 Thread Mike Belopuhov
On Wed, Apr 04, 2012 at 14:42 +0400, Alexander Polakov wrote:
 This is a diff from NetBSD pr.34583:
 http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=34583
 
 Quoting the author:
 
   I noticed that when writing large file (hundreds of megabytes)
   to an msdos disk, the writing speed to a file decreases with the
   file length.
   Since I have some experience with messydos filesystems (I wrote
   MSH: for the Amiga) I took a look.
   The obvious suspicion with operations that slow down with the
   length of a file is an excessive traversal of the FAT cluster
   chain. However, there is a cache that caches 2 positions: the
   last cluster in the file, and the most recently looked up one.
   Debugging info showed however that frequent full traversals were
   still made. So, apparently when extending a file and after
   updating the end cluster, the previous end is again needed.
   Adding a 3rd entry in the cache, which keeps the end position
   from just before extending a file.
   This has the desired effect of keeping the write speed constant.
   (What it is that needs that position I have not been able to
   ascertain from the filesystem code; it doesn't seem to make
   sense, actually, to read or write clusters before the original
   EOF. I was hoping to find the place where the cache is trashed
   and rewrite it to get the desired info from it beforehand, so
   that the extra cache entry is again unneeded, but alas.)
 
 While there, I changed 0 to NULL for two pointer arguments of
 extendfile().
 

i agree that this is a great find.  i don't really like the diff though.
i see no point in introducing this macro.  what do others think?

Index: msdosfs/denode.h
===
RCS file: /cvs/src/sys/msdosfs/denode.h,v
retrieving revision 1.23
diff -u -p -r1.23 denode.h
--- msdosfs/denode.h17 Jul 2010 19:27:07 -  1.23
+++ msdosfs/denode.h4 Apr 2012 12:20:23 -
@@ -116,10 +116,11 @@ struct fatcache {
  * cache is probably pretty worthless if a file is opened by multiple
  * processes.
  */
-#defineFC_SIZE 2   /* number of entries in the cache */
+#defineFC_SIZE 3   /* number of entries in the cache */
 #defineFC_LASTMAP  0   /* entry the last call to pcbmap() 
resolved
 * to */
 #defineFC_LASTFC   1   /* entry for the last cluster in the 
file */
+#defineFC_OLASTFC  2   /* entry for the previous last cluster 
*/
 
 #defineFCE_EMPTY   0x  /* doesn't represent an actual 
cluster # */
 
Index: msdosfs/msdosfs_fat.c
===
RCS file: /cvs/src/sys/msdosfs/msdosfs_fat.c,v
retrieving revision 1.22
diff -u -p -r1.22 msdosfs_fat.c
--- msdosfs/msdosfs_fat.c   4 Jul 2011 04:30:41 -   1.22
+++ msdosfs/msdosfs_fat.c   4 Apr 2012 12:20:26 -
@@ -952,6 +952,13 @@ extendfile(struct denode *dep, uint32_t 
return (error);
}
 
+   /*
+* Preserve value for the last cluster before extending the file
+* to speed up further lookups.
+*/
+   fc_setcache(dep, FC_OLASTFC, dep-de_fc[FC_LASTFC].fc_frcn,
+   dep-de_fc[FC_LASTFC].fc_fsrcn);
+
while (count  0) {
/*
 * Allocate a new cluster chain and cat onto the end of the



Re: diff: improving msdosfs write speed for large files

2012-04-04 Thread Kenneth R Westerback
On Wed, Apr 04, 2012 at 03:51:38PM +0200, Mike Belopuhov wrote:
 On Wed, Apr 04, 2012 at 14:42 +0400, Alexander Polakov wrote:
  This is a diff from NetBSD pr.34583:
  http://gnats.netbsd.org/cgi-bin/query-pr-single.pl?number=34583
  
  Quoting the author:
  
  I noticed that when writing large file (hundreds of megabytes)
  to an msdos disk, the writing speed to a file decreases with the
  file length.
  Since I have some experience with messydos filesystems (I wrote
  MSH: for the Amiga) I took a look.
  The obvious suspicion with operations that slow down with the
  length of a file is an excessive traversal of the FAT cluster
  chain. However, there is a cache that caches 2 positions: the
  last cluster in the file, and the most recently looked up one.
  Debugging info showed however that frequent full traversals were
  still made. So, apparently when extending a file and after
  updating the end cluster, the previous end is again needed.
  Adding a 3rd entry in the cache, which keeps the end position
  from just before extending a file.
  This has the desired effect of keeping the write speed constant.
  (What it is that needs that position I have not been able to
  ascertain from the filesystem code; it doesn't seem to make
  sense, actually, to read or write clusters before the original
  EOF. I was hoping to find the place where the cache is trashed
  and rewrite it to get the desired info from it beforehand, so
  that the extra cache entry is again unneeded, but alas.)
  
  While there, I changed 0 to NULL for two pointer arguments of
  extendfile().
  
 
 i agree that this is a great find.  i don't really like the diff though.
 i see no point in introducing this macro.  what do others think?

Agreed. I like this version better.

 Ken

 
 Index: msdosfs/denode.h
 ===
 RCS file: /cvs/src/sys/msdosfs/denode.h,v
 retrieving revision 1.23
 diff -u -p -r1.23 denode.h
 --- msdosfs/denode.h  17 Jul 2010 19:27:07 -  1.23
 +++ msdosfs/denode.h  4 Apr 2012 12:20:23 -
 @@ -116,10 +116,11 @@ struct fatcache {
   * cache is probably pretty worthless if a file is opened by multiple
   * processes.
   */
 -#define  FC_SIZE 2   /* number of entries in the cache */
 +#define  FC_SIZE 3   /* number of entries in the cache */
  #define  FC_LASTMAP  0   /* entry the last call to pcbmap() 
 resolved
* to */
  #define  FC_LASTFC   1   /* entry for the last cluster in the 
 file */
 +#define  FC_OLASTFC  2   /* entry for the previous last cluster 
 */
  
  #define  FCE_EMPTY   0x  /* doesn't represent an actual 
 cluster # */
  
 Index: msdosfs/msdosfs_fat.c
 ===
 RCS file: /cvs/src/sys/msdosfs/msdosfs_fat.c,v
 retrieving revision 1.22
 diff -u -p -r1.22 msdosfs_fat.c
 --- msdosfs/msdosfs_fat.c 4 Jul 2011 04:30:41 -   1.22
 +++ msdosfs/msdosfs_fat.c 4 Apr 2012 12:20:26 -
 @@ -952,6 +952,13 @@ extendfile(struct denode *dep, uint32_t 
   return (error);
   }
  
 + /*
 +  * Preserve value for the last cluster before extending the file
 +  * to speed up further lookups.
 +  */
 + fc_setcache(dep, FC_OLASTFC, dep-de_fc[FC_LASTFC].fc_frcn,
 + dep-de_fc[FC_LASTFC].fc_fsrcn);
 +
   while (count  0) {
   /*
* Allocate a new cluster chain and cat onto the end of the