Here is some feedback of 2.4.13+the patch from Chris. Two tests: local
fs and NFS. I still see odd things happening at files above 300G. This
is reiserfs formatted -v2 (3.6) with a default mount (tail).

Linux box is 512M RAM. The files fit in buffer cache. I run the bonnies
of size n+50. Results are fairly constant until I hit about 300M. Then
they seem to fall off an edge. I'll include only two sample points for
each setup.

LOCAL DISK TESTING:

/home/ewhiting> bonnie -s 100
Writing with putc()...  done:   9660 kB/s  99.8 %CPU
Rewriting...            done: 124613 kB/s  99.8 %CPU
Writing intelligently...done:  68415 kB/s  99.6 %CPU

/home/ewhiting> bonnie -s 300
Writing with putc()...  done:   9185 kB/s  95.4 %CPU
Rewriting...            done:  20239 kB/s  19.1 %CPU
Writing intelligently...done:  38978 kB/s  58.1 %CPU

Those rewriting and block write numbers are constant across file sizes
until you hit about 300M.



NFS DISK TESTING: 
Server -- Linux/reiserfs/2.4.13+anne/chris patch. 
Client -- Solaris 8 with latest NFS patches

I notice that the network activity graph seems to 'stall' ocassionally
during the 300M writes. It does not 'stall' on the 100 or 200M writes.
Here again you see the speed drop off on rewrite and block writes.

bash$ bonnie -s 100
Writing with putc()...  done:   8074 kB/s  58.6 %CPU
Rewriting...            done:   6877 kB/s  15.1 %CPU
Writing intelligently...done:  11287 kB/s  19.7 %CPU

bash$ bonnie -s 300
Writing with putc()...  done:   7795 kB/s  54.9 %CPU
Rewriting...            done:   4616 kB/s   9.4 %CPU
Writing intelligently...done:   5655 kB/s  29.8 %CPU

NFS is setup with multiple writers. I tested with one nfsd process and
got similar results.

eric



Chris Mason wrote:
> 
> On Tuesday, October 23, 2001 02:19:57 PM -0400 Anne Milicia 
><[EMAIL PROTECTED]> wrote:
> 
> [ great analysis of fragmentation problem + fix ]
> >
> > So, my question is can journal_mark_freed() be safely skipped when
> > reiserfs_free_block() is called by __discard_prealloc()?  Can you think
> > of any problems with making this change?
> >
> 
> Ok, this fragmentation problem is triggered by any programs that open,
> append, close the same file frequently enough that it happens more than once
> inside a transaction (5 seconds).  This is because preallocated blocks are
> discarded on file close.
> 
> So, I wrote a program that did this X number of times in a loop, and
> got similar results to Anne's.  I reworked her patch slightly to
> keep the interface to reiserfs_free_block the same, but I did keep
> the change to __discard_prealloc.
> 
> In her code, __discard_prealloc preseves the next block number to be preallocated, 
>greating increasing the chance this block will be chosen
> again if we do allocation before the inode goes out of cache.
> 
> Normally find_tag does this for us, but there are a few corner cases
> (mostly with holes) where it can't.
> 
> Anyway, Anne, could you please take a look and make sure this still
> improves your performance?  I think the odd results you got for 2.4.12
> before were probably due to actual fragmentation against prellocated
> blocks from other files.  With a single writer, 2.4.13 allocates blocks
> in order during the open, append, close loop for me.
> 
> thanks,
> Chris
> 
> --- 1.11/fs/reiserfs/bitmap.c Wed, 24 Oct 2001 09:50:17 -0400
> +++ 2413.10/fs/reiserfs/bitmap.c Thu, 25 Oct 2001 14:07:55 -0400
> @@ -84,7 +84,7 @@
>     to free a list of blocks at once. -Hans */
>                                 /* I wonder if it would be less modest
>                                     now that we use journaling. -Hans */
> -void reiserfs_free_block (struct reiserfs_transaction_handle *th, unsigned long 
>block)
> +static void _reiserfs_free_block (struct reiserfs_transaction_handle *th, unsigned 
>long block)
>  {
>      struct super_block * s = th->t_super;
>      struct reiserfs_super_block * rs;
> @@ -92,18 +92,12 @@
>      struct buffer_head ** apbh;
>      int nr, offset;
> 
> -  RFALSE(!s, "vs-4060: trying to free block on nonexistent device");
> -  RFALSE(is_reusable (s, block, 1) == 0, "vs-4070: can not free such block");
> -
>    rs = SB_DISK_SUPER_BLOCK (s);
>    sbh = SB_BUFFER_WITH_SB (s);
>    apbh = SB_AP_BITMAP (s);
> 
>    get_bit_address (s, block, &nr, &offset);
> 
> -  /* mark it before we clear it, just in case */
> -  journal_mark_freed(th, s, block) ;
> -
>    reiserfs_prepare_for_journal(s, apbh[nr], 1 ) ;
> 
>    /* clear bit for the given block in bit map */
> @@ -122,7 +116,26 @@
>    s->s_dirt = 1;
>  }
> 
> +void reiserfs_free_block (struct reiserfs_transaction_handle *th,
> +                          unsigned long block) {
> +    struct super_block * s = th->t_super;
> +
> +    RFALSE(!s, "vs-4061: trying to free block on nonexistent device");
> +    RFALSE(is_reusable (s, block, 1) == 0, "vs-4071: can not free such block");
> +    /* mark it before we clear it, just in case */
> +    journal_mark_freed(th, s, block) ;
> +    _reiserfs_free_block(th, block) ;
> +}
> +
> +/* preallocated blocks don't need to be run through journal_mark_freed */
> +void reiserfs_free_prealloc_block (struct reiserfs_transaction_handle *th,
> +                          unsigned long block) {
> +    struct super_block * s = th->t_super;
> 
> +    RFALSE(!s, "vs-4060: trying to free block on nonexistent device");
> +    RFALSE(is_reusable (s, block, 1) == 0, "vs-4070: can not free such block");
> +    _reiserfs_free_block(th, block) ;
> +}
> 
>  /* beginning from offset-th bit in bmap_nr-th bitmap block,
>     find_forward finds the closest zero bit. It returns 1 and zero
> @@ -649,11 +662,13 @@
>  static void __discard_prealloc (struct reiserfs_transaction_handle * th,
>                                 struct inode * inode)
>  {
> +  unsigned long search_start = inode->u.reiserfs_i.i_prealloc_block ;
>    while (inode->u.reiserfs_i.i_prealloc_count > 0) {
> -    reiserfs_free_block(th,inode->u.reiserfs_i.i_prealloc_block);
> +    reiserfs_free_prealloc_block(th,inode->u.reiserfs_i.i_prealloc_block);
>      inode->u.reiserfs_i.i_prealloc_block++;
>      inode->u.reiserfs_i.i_prealloc_count --;
>    }
> +  inode->u.reiserfs_i.i_prealloc_block = search_start ;
>    list_del (&(inode->u.reiserfs_i.i_prealloc_list));
>  }
>

Reply via email to