[EMAIL PROTECTED] wrote:
> 
> Hi all.  I have a need to stretch a file (rapidly) to a new
> allocated size, but do not want to actually write any data
> from this machine at this time (I just need the blocks to
> be allocated).
> 
> Take a look at the function I wrote below.  It works, but
> I want something faster.  Once the file is fairly large
> (say 100 MB) and I need to add another 100 MB, the
> function gets slower, and slower, and slower...
> 
> I suppose I could start at the last file size (and hope there
> are no holes before it).  That would help.  But I think the big
> killer is that the ext2_getblk() function is bothering to do a
> bunch of memory allocation that I will never need.
> 
> Can anyone think of a much faster way to write this?  I'm
> willing to go pretty deep and write some ext2-specific
> functions, as long as I know the gotchas and things to
> look out for ahead of time.
> 
> I'm also willing to pay for help....
> 
> static int extendAllocation(int userModeFD, long newAllocation)
>  {
>   long fileBlks = 0,  i;
>   int res = 0;
>   struct buffer_head * bh = NULL;
>   struct file *filp = fget(userModeFD);
>   struct inode *inode = NULL;
> 
>   if (filp == NULL) return -EINVAL;
>   inode = filp->f_dentry->d_inode;
>   fileBlks = (newAllocation + (inode->i_blksize - 1)) / inode->i_blksize;
> 
>   lock_kernel();
>   /* loop through the file's blocks, forcing each one to be allocated */
>   for (i = inode->i_size/inode->i_blksize; i < fileBlks; i++) {
>     bh = ext2_getblk(inode, i, 1, &res);
>     if (!bh)  break;
>     brelse(bh);
>   }
>   fput(filp);
>   unlock_kernel();
>   return 0;
> }

Caveat: see Stephen's warning; you shouldn't expect the slightest degree
of privacy on a filesystem running this code.  You would have to make
sure that 1) only users with appropriate access rights can mount or read
the filesystem or 2) there is no chance of anything sensitive getting
onto the filesystem.  (These amount to the same thing.)

All the same, it's interesting to know why the allocation slows down as
the file grows.  Possible reasons I can think of are:

  - Make sure you're using a 4K block size

  - This code constantly hashes into the buffer cache.  Could hash
collisions be slowing things down?  (I doubt it.)

  - The buffer list is getting longer and longer - try bforget(bh)
instead of brelse(bh).  The long buffer list causes bdflush to spend
longer scanning it - it's n^2 behaviour.  This is a possible offender.

  - As soon as you pass 4 meg ext2 starts using double indirect blocks -
this probably doesn't cost much, and you don't go to triple-indirect
until 4 Gig - do you see another slowdown there?

  - Seeking - I haven't dug into this, but are the index blocks being
allocated near the data blocks as they should?  How often is the inode
being updated?

  - How much memory do you have?

  - Is your disk nearly full for these tests, resulting in longer and
longer free block search times, plus more seeking?

  - Is your CPU usage going up as you get into the longer allocations?
If not, then the slowdown is most probably due to extra seeking, with a
second suspect being redundant updates of metadata.  It would be nice to
know what your CPU is doing.

  - Oh wait a sec.  As you allocate more and more you *are* going to
have to keep seeking back to the metadata to update it.  This could be
the culprit.

I'm sure Stephen and others can add to this list.

You can see here that ext2 allocates blocks one at a time, no matter what
you do (the loop by n just descends through the index tree):

  
http://innominate.org/~graichen/projects/lxr/source/fs/ext2/inode.c?v=v2.3#L375
   static int ext2_alloc_branch(struct inode *inode,

This means there's a limit on how much you can improve the situation
before you have to start hacking ext2.

--
Daniel
-
To unsubscribe from this list: send the line "unsubscribe linux-fsdevel" in
the body of a message to [EMAIL PROTECTED]