Hi Laurent,
> As I like to do something useful when I learn a new language, I
> decided to do a backup system.
OK.
> As you can see, given the progression, it will soon take more time
> to store the meta-data than processing the data itself.
> Does anybody see if I made a mistake in my code?
Yes, the main problem is this +Joint:
> (class +File +Entity)
> ...
> (rel backups (+List +Joint) files (+Backup)) #
>
> (class +Backup +Entity)
> ...
> (rel files (+List +Joint) backups (+File)) #
It causes the list of files in the +Backup object to be extended each
time a +File is created:
> (de addFile (Bk P)
> ...
> (put!>
> o
> 'backups
> (append (; 'o backups) Bk) )
Thus, the single +Backup object gets larger and larger.
As a general rule, you can always use '+Joint' and '+Ref +Link'
interchangeably. You want to use a list of '+Joint's only if the list
doesn't get too long (fewer than, say, 100 entries).
So the most important change is to remove the line
(rel files (+List +Joint) backups (+File))
from the +Backup class, and use
(rel backups (+List +Ref +Link) NIL (+Backup))
in the +File class. This will increase the speed dramatically.
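With the 'files' relation gone from +Backup, the files of a backup can
still be reached through the 'backups' index of +File. A minimal sketch,
assuming the class definitions from this thread and an open database;
'filesOf' is a hypothetical helper name, not part of the original code:

```picolisp
# Hypothetical helper (assumes the +File class with the
# '(+List +Ref +Link)' relation 'backups' shown above).
# Returns all +File objects whose 'backups' list contains 'Bk',
# looked up through the index instead of a list stored in 'Bk'.
(de filesOf (Bk)
   (collect 'backups '+File Bk) )
```

The +Backup object itself then stays small, no matter how many files
refer to it.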
After that, there are a few more places that should be optimized. You'll
notice the differences only when you create a lot more objects.
In the specification of the database block sizes
> (dbs
> (0)
> (0 +Chunk)
> (0 +File)
> (0 (+File pth inode))
> (4 +Backup)
> (0 (+Backup name startDT hostName)) )
the '0's mean that the block size is 64 bytes. This is a bit too small
for '+File' and, more importantly, for the index trees. I would use '2'
for the '+File' and '+Backup' objects, and '4' for the indexes. This gives:
(2 +File)
(4 (+File pth inode))
(2 +Backup)
(4 (+Backup name startDT hostName)) )
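To see what these scale factors amount to: a factor N stands for a block
size of 64 bytes shifted left by N. A small sketch ('blockSize' is a
hypothetical helper, just for illustration):

```picolisp
# Block size in bytes for a 'dbs' scale factor N: 64 shifted left by N.
# (A negative count makes '>>' shift to the left.)
(de blockSize (N)
   (>> (- N) 64) )

# Print the sizes for the factors used above
(for N (0 2 4)
   (prinl N " -> " (blockSize N) " bytes") )
```

So '2' gives 256-byte blocks for the objects, and '4' gives 1024-byte
blocks for the index trees.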
Then, the usage of 'request' is not as intended:
> (request '(+File)
> 'pth P
> 'size 0
> 'accessTime 1
> 'modifyTime 2
> 'changeTime 3
> 'inode 4
> 'links 1
> 'uid 1001
> 'gid 1002
> 'accessRights 766 )
'request' searches with the given keys for an object, before it decides
whether to use an existing object or to create a new one.
So, typically, if 'pth' is a characteristic key, it would be called as
(let Obj (request '(+File) 'pth P)
(put> Obj ..)
...
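A fuller sketch of that pattern, assuming 'pth' is an indexed relation of
+File and that an existing entry should be updated rather than
duplicated. 'touchFile' and its parameter 'Size' are hypothetical names:

```picolisp
# Sketch only: 'touchFile' and 'Size' are made up for illustration.
# Assumes an open database and the +File class from this thread.
(de touchFile (P Size)
   (let Obj (request '(+File) 'pth P)  # Find by path, or create
      (put> Obj 'size Size)            # Then set the non-key properties
      Obj ) )
```

Only the key goes into the 'request' call; everything else is set
afterwards on whichever object comes back.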
However, I suspect that 'request' is not needed here at all, as you
create new +File objects. So 'new' is the way:
(de addFile (Bk P)
   (let Obj
      (new (db: +File) '(+File)
         'pth P
         'size 0
         'accessTime 1
         'modifyTime 2
         'changeTime 3
         'inode 4
         'links 1
         'uid 1001
         'gid 1002
         'accessRights 766 )
      ## Not necessary (put> Obj 'backups (append (; Obj backups) Bk))
      (put> Obj 'backups Bk)
      (at (0 . 10000) (commit))
      Obj ) )
Note two other changes I made:
- Because 'backups' is a +List relation, the explicit 'append' is not
necessary. Just putting 'Bk' is enough, the list will be created
automatically.
- Calling 'new!', 'put!>' etc., i.e. the functions which call
(dbSync) and then (commit) each time they are called, is very
expensive. For a large-volume input it is better to go into
single-user mode of the DB, avoid (dbSync), and call 'commit' less
often. In the example above, it is called only every 10000th time.
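How 'at' spaces out the commits can be tried without a database. In this
sketch the counter lives in the literal cell (0 . 10), which 'at'
modifies destructively, so it persists across calls. 'commitDue?' is a
hypothetical name, and the 'prinl' merely stands in for the real
(commit):

```picolisp
# Returns T on every 10th call, NIL otherwise.
# The cell (0 . 10) is part of the function body and is
# counted up in place by 'at', then reset when it reaches 10.
(de commitDue? ()
   (at (0 . 10) T) )

# Simulate 25 insertions
(for I 25
   (when (commitDue?)
      (prinl "commit after item " I) ) )
```

This reports a commit after items 10 and 20. Items 21 to 25 are left
uncommitted, which is why a final (commit) is needed once the loop is
done.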
The same applies to the backup function
> (de Backup (rootPath)
> (let obj1
> (request '(+Backup)
> 'name (stamp)
> 'startDT (stamp)
> 'basePath rootPath
> 'hostName (host "localhost") )
> (put! *DB 'currentBackup obj1)
> # now, walk the path
> (let Dir rootPath
> (recur (Dir)
> (for F (dir Dir)
> (let Path (pack Dir "/" F)
> (addFile obj1 Path)
> # note: change this test: it considers a link to a dir as a dir!
> (if (=T (car (info Path)))
> (recurse Path) ) ) ) ) )
> (length (get obj1 'files)) ) )
Avoiding 'request' and 'put!' gives basically
(de backup (RootPath)
   (let Obj1
      (new (db: +Backup) '(+Backup)
         'name (stamp)
         'startDT (stamp)
         'basePath RootPath
         'hostName (host "localhost") )
      (put *DB 'currentBackup Obj1)
      (commit)
      # now, walk the path
      (let Dir RootPath
         (recur (Dir)
            (for F (dir Dir)
               (let Path (pack Dir "/" F)
                  (addFile Obj1 Path)
                  # note: change this test: it considers a link to a dir as a dir!
                  (if (=T (car (info Path)))
                     (recurse Path) ) ) ) ) )
      (commit)
      (count (tree 'backups '+File)) ) )
: (bench (backup "/home"))
0.460 sec
-> 7125
Cheers,
- Alex