Hello Jan!

On Tuesday, 13 April 2010, Jan Horák wrote:
> On 12.4.2010 8:02, Philipp Marek wrote:
> > Sorry for the delay; but reading the thread "Severe performance issues
> > with large directories" I just remembered that the backend has a little
> > bit of a problem with big directories - storage overhead.
> >
> > Do you see any way to split directories into a series of blocks (like
> > is done for files), and, when changing only a few of the entries, to use
> > pointers to the unmodified blocks of the old directory?
> >
> > I don't propose a real delta design - that was too slow, IIRC.
> > Just re-use of directory blocks; that shouldn't bring any performance
> > issues.
> >
> > Is there some way to do that? Perhaps multiple "." entries in a
> > directory, which just point to other parts?
> I'm not sure if we mean the same issue, but I was thinking about a kind
> of hash table. With a well-chosen table size it could give good results,
> I suppose.
Sorry, I didn't make myself clear.
I didn't find the issue I'm talking about in the issue tracker; the problem
is that the backends (FSFS, BDB) don't store directories deltified (for
performance reasons), so modifying an entry in or below a big directory
re-writes the whole directory - and for big directories that means several
megabytes.

So I'd suggest changing the directory storage. Two ideas (see the sketch
below):

* Either use a new table, with fields like parent, name (or path),
  valid-from-revision, and valid-before-revision, or something like that;
  changing an entry then means only updating valid-before of the old
  record and inserting a new one.

* Or, if you want to keep storing directories the same way as file data
  (as FSFS and BDB do now), limit such blocks of directory data to a few
  KB, and define an indirect block that tells which data blocks make up
  the directory. A new revision could then reference all the unchanged
  blocks of the older one.

I hope that this explains it a bit better.

Regards,

Phil
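P.S.: To make the two ideas above a bit more concrete, here's a rough
sketch in C. All type and field names are invented for illustration;
this is not existing FSFS or BDB code, just one possible shape for the
structures.

    /* Hypothetical sketch only - not actual FSFS/BDB data structures. */

    /* Idea 1: one row per directory entry, valid for a revision range.
     * Changing an entry closes the old row (sets valid_before_rev) and
     * inserts a new one; unchanged entries are never touched. */
    struct dir_entry_row {
      const char *parent_path;    /* directory this entry lives in */
      const char *name;           /* entry name */
      long valid_from_rev;        /* first revision this row applies to */
      long valid_before_rev;      /* first revision it no longer applies
                                     to, or -1 for "still current" */
      const char *node_rev_id;    /* what the entry points to */
    };

    /* Idea 2: directory data split into fixed-size blocks, plus an
     * indirect block listing which data blocks make up the directory. */
    #define DIR_BLOCK_SIZE 4096   /* "a few KB" per data block */

    struct dir_block_ref {
      long rev;                   /* revision the block was written in */
      long offset;                /* where the block lives in that rev */
    };

    struct dir_indirect_block {
      int count;                  /* number of data blocks */
      struct dir_block_ref refs[1]; /* refs[0..count-1]; unchanged refs
                                       simply point into older revisions */
    };

With the indirect-block layout, changing a single entry would mean
writing one new data block plus one new indirect block; all the other
block references are copied unchanged from the previous revision, so
the cost no longer grows with the size of the directory.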