Re: A dbs+ problem

2013-09-09 Thread Alexander Burger
On Mon, Sep 09, 2013 at 09:25:54PM +0700, Henrik Sarvell wrote:
> Aha ok so the file grew by accident to become something that is not really
> best practice?

Well, it can also be "best practice", if you like. Just don't worry too
much about it. For an application with as many classes and indexes as in
that example, it may make sense to create more files. It is difficult to
give a clear rule.


> Anyway your template example pointed the way for me, I simply "pad" my dbs
> E/R like so:
> 
> (dbs
>(3 +Page #1
> ...
>(5) #13
>(6)) #14
> 
> And then in the project building on the framework:
> 
> (dbs+ 15
>(3 +Blog)
>(3 (+Blog id url hline content pdate)) )

Yes, perfect. So you have room for later extensions.

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: A dbs+ problem

2013-09-09 Thread Henrik Sarvell
Aha ok so the file grew by accident to become something that is not really
best practice?

Anyway your template example pointed the way for me, I simply "pad" my dbs
E/R like so:

(dbs
   (3 +Page                                  #1
      +PageAttr
      +Block
      +BlockAttr
      +Menu
      +MenuAttr
      +AclGroup
      +AclPerm
      +AclUserLink
      +AclPermLink
      +Lang
      +LocStr
      +LocCache
      +Sess )
   (4 +User)                                 #2
   (4 (+User id username))                   #3
   (3 (+Page id name path parent)            #4
      (+Block id page type name loc parent)
      (+BlockAttr id block name value)
      (+PageAttr id page name value)
      (+Menu id name parent page pos)
      (+MenuAttr id menu name value)
      (+AclGroup id name)
      (+AclPerm id name)
      (+AclUserLink id user acl)
      (+AclPermLink id perm acl) )
   (4 (+LocStr id name lang val reqTime)     #5
      (+LocCache id page str)
      (+Lang id lang) )
   (4 (+Sess id user sid data exp) )         #6
   (3)                                       #7
   (4)                                       #8
   (5)                                       #9
   (6)                                       #10
   (3)                                       #11
   (4)                                       #12
   (5)                                       #13
   (6) )                                     #14

And then in the project building on the framework:

(dbs+ 15
   (3 +Blog)
   (3 (+Blog id url hline content pdate)) )

I have tested adding something later to, for instance, file #7.

Works.
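
For example (the class and relation names here are invented, just for
illustration), a later version of the framework could fill one of the
reserved slots without touching the project's numbering, since the
project's own dbs+ call starts at 15:

(dbs
   ...                          # slots #1 - #6 unchanged
   (3 +Tag (+Tag id name))      # #7, previously the empty (3); '+Tag' is made up
   ... )                        # #8 - #14 unchanged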



On Mon, Sep 9, 2013 at 2:34 AM, Alexander Burger wrote:

> Hi Henrik,
>
> > So you recommend 10 files in total, at the beginning, no need for more, it
> > might even be suboptimal with more, yet at the end your example has a lot
> > more files than 10.
> >
> > I'm confused?
>
> Sorry, my mail was a bit contradictory.
>
> What I meant is that I usually start with the short one. I always use it
> as a template for new projects. In the initial phase, when the database
> is still changing a lot, the number of files may increase a bit, but I delete
> and recreate the DB all the time anyway (populating it from a file
> usually called "xxx/init.l").
>
> I posted the long example to show you that you don't need to put each
> entity or each index into its own file. The history of _that_ 'dbs' call
> is different: it is from an application running since 2001, which has been
> extended and re-organized several times in the meantime (the
> first version was even in the old single-file style of PicoLisp).
>
> ♪♫ Alex
> --
> UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
>


Re: A dbs+ problem

2013-09-08 Thread Alexander Burger
Hi Henrik,

> So you recommend 10 files in total, at the beginning, no need for more, it
> might even be suboptimal with more, yet at the end your example has a lot
> more files than 10.
> 
> I'm confused?

Sorry, my mail was a bit contradictory.

What I meant is that I usually start with the short one. I always use it
as a template for new projects. In the initial phase, when the database
is still changing a lot, the number of files may increase a bit, but I
delete and recreate the DB all the time anyway (populating it from a file
usually called "xxx/init.l").
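
A minimal sketch of what such an init file might contain (the class,
relation and link names here are just placeholders):

   # Hypothetical seed data; 'new!' creates and commits each object
   (new! '(+Role) 'nm "admin")
   (new! '(+User)
      'nm "henrik"
      'pw "secret"
      'role (db 'nm '+Role "admin") )   # assumes a 'role' +Link relation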

I posted the long example to show you that you don't need to put each
entity or each index into its own file. The history of _that_ 'dbs' call
is different: it is from an application running since 2001, which has
been extended and re-organized several times in the meantime (the first
version was even in the old single-file style of PicoLisp).

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: A dbs+ problem

2013-09-08 Thread Henrik Sarvell
>
> For example, when I have a new class of entities where I estimate that
> these are relatively large objects, I add them to 'D'. Creating a
> new file instead of that won't have any performance advantage, because
> access to objects is done with 'lseek' and thus the size of the files
> doesn't matter.
>
> Opening too many files might even have disadvantages, due to the number
> of open file descriptors, and may have hard-to-predict dis/advantages
> due to worse/better disk cache efficiency.
>

So you recommend 10 files in total at the beginning, with no need for more
(it might even be suboptimal with more), yet at the end your example has a
lot more files than 10.

I'm confused?




On Sun, Sep 8, 2013 at 3:36 PM, Alexander Burger wrote:

> On Sun, Sep 08, 2013 at 02:49:31PM +0700, Henrik Sarvell wrote:
> > At the same time projects that rely on the framework need to be able to
> > do the same thing without being screwed if the framework needs a new
> > "slot".
>
> As I wrote, allocating a new "slot" is not a problem. In fact, this
> happens quite frequently in a typical application.
>
> Still: it is normally not necessary to extend the number of files each
> time a new class or index is added.
>
> Usually, I start up with a skeleton like:
>
>    # Database sizes
>    (dbs
>       (3 +Role +User)             # 512 Prevalent objects
>       (0 )                        # A:64 Tiny objects
>       (1 (+User pw))              # B:128 Small objects
>       (2 )                        # C:256 Normal objects
>       (4 )                        # D:1024 Large objects
>       (6 )                        # E:4096 Huge objects
>       (2 (+Role nm) (+User nm))   # F:256 Small indexes
>       (4 )                        # G:1024 Normal indexes
>       (6 ) )                      # H:4096 Large indexes
>
> and then, whenever necessary, add new entries to _these_ files.
>
> For example, when I have a new class of entities where I estimate that
> these are relatively large objects, I add them to 'D'. Creating a
> new file instead of that won't have any performance advantage, because
> access to objects is done with 'lseek' and thus the size of the files
> doesn't matter.
>
> Opening too many files might even have disadvantages, due to the number
> of open file descriptors, and may have hard-to-predict dis/advantages
> due to worse/better disk cache efficiency.
>
> For a concrete result of that process, see the 'dbs' call at the end of
> this mail [*]. Here we probably already _have_ too many files ;-)
>
>
> Also, keep in mind that this whole issue is not sooo extremely critical,
> as long as you don't put _everything_ into a single file. If objects are
> smaller than the block size, a little space is wasted, and if they are
> larger, they will occupy more than one block, resulting in slightly
> longer load times (which is not critical either, because objects are
> cached anyway).
>
> A little more critical than entities are the indexes. The block size
> should not be too small here, but if you go with all indexes on block
> size 4, you are on the safe side (just wasting _a_little_ space for very
> small indexes). In my experience, index block sizes of more than 6 show
> a decrease in performance, probably because of longer loading times.
>
>
> > If *Dbs is only used to point to locations where new objects are created
> > how does the database logic work for objects that already exist? How does
> > it know to look for X in file Y if it doesn't use the pointers in *Dbs?
>
> The value of '*Dbs' itself is used _only_ in 'pool' (to create or open
> the DB). So this list in *Dbs doesn't hold any pointers, just the
> indicators for the block sizes in the corresponding file. And even these
> are used only once, when the DB is created the first time. After that,
> they are just used as indicators to _open_ the existing files (where the
> block sizes are already kept in the file headers).
>
>
> Objects are located solely by their "name", which encodes the file
> number and the file-offset to the symbol's first block. Once an object
> is created, it won't move (its first block, that is).
>
> ♪♫ Alex
>
>
> *) Example from an existing application (just for illustration, no need
>    to understand the details).
>
> # Database sizes
> (dbs
>(4)
>(1 +Note +Leg (+User pw))
>(1 +Role +Jahr +Art +ArtKat +Status +BSped +Abteilung +Inhalt +Land)
>(1 +Kfz +Log +Currency +Team +Steuer +Anlass +Branche +Carrier +Inco
> +Svs)
>(1 +Memo +Besuch +Anr +AbwChk +Abw +Arb)
>(2 +User +Mitarb +Sup +Ort +Dln +Ves +ZollAmt +Lag)
>(3 +Kleb +PrForm)
>(4 +Firma +Entn +Artikel +Schaden +AbhA)
>(4 +TxtKat +MacText +BrfText +prs +PGrp)
>(4 +Aust)
>(4 +Messe)
>(5 +Beleg +PosText)
>(5 +Sendung)
>(5 +Ansp)
>(6 +Person)
>(3
>   (+User nm)
>   (+Role nm)
>   (+Kfz kz)
>   (+Currency key)
>   (+Team key)
>   (+Jahr key)
>

Re: A dbs+ problem

2013-09-08 Thread Alexander Burger
On Sun, Sep 08, 2013 at 02:49:31PM +0700, Henrik Sarvell wrote:
> At the same time projects that rely on the framework need to be able to do
> the same thing without being screwed if the framework needs a new "slot".

As I wrote, allocating a new "slot" is not a problem. In fact, this
happens quite frequently in a typical application.

Still: it is normally not necessary to extend the number of files each
time a new class or index is added.

Usually, I start up with a skeleton like:

   # Database sizes
   (dbs
      (3 +Role +User)             # 512 Prevalent objects
      (0 )                        # A:64 Tiny objects
      (1 (+User pw))              # B:128 Small objects
      (2 )                        # C:256 Normal objects
      (4 )                        # D:1024 Large objects
      (6 )                        # E:4096 Huge objects
      (2 (+Role nm) (+User nm))   # F:256 Small indexes
      (4 )                        # G:1024 Normal indexes
      (6 ) )                      # H:4096 Large indexes

and then, whenever necessary, add new entries to _these_ files.

For example, when I have a new class of entities where I estimate that
these are relatively large objects, I add them to 'D'. Creating a new
file instead won't have any performance advantage, because objects are
accessed with 'lseek', so the size of the files doesn't matter.
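
Concretely, that just means editing the existing call; with a made-up
class '+Invoice' and a made-up index 'nr', something like:

   # Database sizes
   (dbs
      (3 +Role +User)             # 512 Prevalent objects
      (0 )                        # A:64 Tiny objects
      (1 (+User pw))              # B:128 Small objects
      (2 )                        # C:256 Normal objects
      (4 +Invoice)                # D:1024 Large objects
      (6 )                        # E:4096 Huge objects
      (2 (+Role nm) (+User nm))   # F:256 Small indexes
      (4 (+Invoice nr))           # G:1024 Normal indexes
      (6 ) )                      # H:4096 Large indexes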

Opening too many files might even have disadvantages, due to the number
of open file descriptors, and may have hard-to-predict dis/advantages
due to worse/better disk cache efficiency.

For a concrete result of that process, see the 'dbs' call at the end of
this mail [*]. Here we probably already _have_ too many files ;-)


Also, keep in mind that this whole issue is not sooo extremely critical,
as long as you don't put _everything_ into a single file. If objects are
smaller than the block size, a little space is wasted, and if they are
larger, they will occupy more than one block, resulting in slightly
longer load times (which is not critical either, because objects are
cached anyway).

A little more critical than the entities are the indexes. The block size
should not be too small here, but if you go with all indexes on block
size 4 (i.e. 64 * 2^4 = 1024 bytes), you are on the safe side (just
wasting _a_little_ space for very small indexes). In my experience, index
block sizes of more than 6 show a decrease in performance, probably
because of longer loading times.


> If *Dbs is only used to point to locations where new objects are created
> how does the database logic work for objects that already exist? How does
> it know to look for X in file Y if it doesn't use the pointers in *Dbs?

The value of '*Dbs' itself is used _only_ in 'pool' (to create or open
the DB). So this list in *Dbs doesn't hold any pointers, just the
indicators for the block sizes of the corresponding files. And even these
are needed only once, when the DB is created for the first time. After
that, they just serve to _open_ the existing files (the block sizes are
already kept in the file headers).
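
Roughly speaking, after the skeleton 'dbs' call above, '*Dbs' should hold
nothing more than one size indicator per file, something like (a sketch,
not verified output):

   : *Dbs
   -> (3 0 1 2 4 6 2 4 6)   # one block-size indicator per DB file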


Objects are located solely by their "name", which encodes the file
number and the file-offset to the symbol's first block. Once an object
is created, it won't move (its first block, that is).

♪♫ Alex


*) Example from an existing application (just for illustration, no need
   to understand the details).

# Database sizes
(dbs
   (4)
   (1 +Note +Leg (+User pw))
   (1 +Role +Jahr +Art +ArtKat +Status +BSped +Abteilung +Inhalt +Land)
   (1 +Kfz +Log +Currency +Team +Steuer +Anlass +Branche +Carrier +Inco +Svs)
   (1 +Memo +Besuch +Anr +AbwChk +Abw +Arb)
   (2 +User +Mitarb +Sup +Ort +Dln +Ves +ZollAmt +Lag)
   (3 +Kleb +PrForm)
   (4 +Firma +Entn +Artikel +Schaden +AbhA)
   (4 +TxtKat +MacText +BrfText +prs +PGrp)
   (4 +Aust)
   (4 +Messe)
   (5 +Beleg +PosText)
   (5 +Sendung)
   (5 +Ansp)
   (6 +Person)
   (3
  (+User nm)
  (+Role nm)
  (+Kfz kz)
  (+Currency key)
  (+Team key)
  (+Jahr key)
  (+Steuer txt)
  (+ZollAmt nm)
  (+Anlass txt)
  (+Anr nm)
  (+PGrp nm prs)
  (+Ves nm ets)
  (+PrForm nm)
  (+Land de en es) )
   (4 (+Leg pos ansp dat mes))
   (4
  (+Kfz mit)
  (+Mitarb sb plz ort tel fax mob mail)
  (+AbwChk nr txt)
  (+Abw ter)
  (+Entn snd)
  (+Arb nr mes snd)
  (+Lag nr mes snd)
  (+Schaden pos btg mit dat kw) )
   (4 (+Note nm act kw) (+Log mit) (+Mitarb name) (+Memo dat mit))
   (3 (+Entn nr) (+Sup mc) (+Branche key) (+Carrier key) (+Inco key))
   (4 (+Sup nm plz ort tel fax mob))
   (4 (+Entn nm txt dat tim kfz))
   (4 (+Artikel nm))
   (4 (+Artikel sup mc kat))
   (4 (+Ort nm lnd))
   (4 (+Ort cont exp imp))
   (4 (+Branche txt) (+Carrier nm) (+Inco nm))
   (4 (+Besuch dat txt))
   (4 (+Person nr nr2))
   (4
  (+Kunde nm)
  (+Partner nm)
  (+Btg nm)
  (+Dfg nm)
  (+Org nm)
  (+Dtv nm)
  (+LkwU nm)
  (

Re: A dbs+ problem

2013-09-08 Thread Henrik Sarvell
The framework will be changing in perpetuity, and it cares about performance,
so being able to specify files is a given.

At the same time projects that rely on the framework need to be able to do
the same thing without being screwed if the framework needs a new "slot".

If *Dbs is only used to point to locations where new objects are created
how does the database logic work for objects that already exist? How does
it know to look for X in file Y if it doesn't use the pointers in *Dbs?

I'm trying to understand why no gaps in the file numbers are allowed, and
whether it is possible (or how much work it would be) to allow them.





On Sun, Sep 8, 2013 at 2:09 PM, Alexander Burger wrote:

> Hi Henrik,
>
> > number than the dbs call has resulted in, thus causing a "gap", it seems
> > like this is a no-no.
>
> Right. The file numbers must be continuous.
>
>
> > (dbs
> >(3 +Server)
> >(3 (+Server ip)))
> >
> > (dbs+ 100
> >(4 +Entry)
> >(4 (+Entry tag)) )
> > ...
> > Works fine if 100 is changed to 3 though.
> >
> > This is causing problems for me with the extended framework I'm working
> > on: the framework loads its own E/R structure with a dbs call. The
> > thought here is that projects that use the framework will first load it
> > and its E/R. Then the project code will add its own E/R through a dbs+
> > call.
>
> This is fine, and the reason for 'dbs+'.
>
> However, the data structures created by 'dbs' and 'dbs+' are mainly used
> for creating _new_ objects, and have no influence on existing objects,
> so that changing them for existing databases is not a very good idea and
> may cause extensive (re)work (as I explained in my last mail).
>
>
> > The crux is that since the framework will always be under development it
> > might need more database numbers. I was hoping that I could pick some
> > arbitrarily high number like 100 as a rule for projects to use in their
> > dbs+ calls to ensure that there would never be a collision.
>
> As long as the framework is under heavy development, I would not care
> too much about '*Dbs', since performance (optimal distribution of
> objects across DB files) is not an issue.
>
>
> Still you can keep it dynamic in a convenient way, if you keep in mind
> the above restrictions and caveats.
>
> I would write:
>
>(dbs
>   ... )
>
>...
>
>(dbs+ (inc (length *Dbs))
>   ... )
>
> ♪♫ Alex
> --
> UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
>


Re: A dbs+ problem

2013-09-08 Thread Alexander Burger
Hi Henrik,

> number than the dbs call has resulted in, thus causing a "gap", it seems
> like this is a no-no.

Right. The file numbers must be continuous.


> (dbs
>(3 +Server)
>(3 (+Server ip)))
> 
> (dbs+ 100
>(4 +Entry)
>(4 (+Entry tag)) )
> ...
> Works fine if 100 is changed to 3 though.
> 
> This is causing problems for me with the extended framework I'm working on:
> the framework loads its own E/R structure with a dbs call. The thought
> here is that projects that use the framework will first load it and its
> E/R. Then the project code will add its own E/R through a dbs+ call.

This is fine, and the reason for 'dbs+'.

However, the data structures created by 'dbs' and 'dbs+' are mainly used
for creating _new_ objects, and have no influence on existing objects,
so that changing them for existing databases is not a very good idea and
may cause extensive (re)work (as I explained in my last mail).


> The crux is that since the framework will always be under development it
> might need more database numbers. I was hoping that I could pick some
> arbitrarily high number like 100 as a rule for projects to use in their
> dbs+ calls to ensure that there would never be a collision.

As long as the framework is under heavy development, I would not care
too much about '*Dbs', since performance (optimal distribution of
objects across DB files) is not an issue.


Still, you can keep it dynamic in a convenient way, if you keep in mind
the above restrictions and caveats.

I would write:

   (dbs
  ... )

   ...

   (dbs+ (inc (length *Dbs))
  ... )
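
For instance (with made-up class names), the combined effect would be
something like:

   # Framework:
   (dbs
      (3 +Page)
      (3 (+Page nm)) )

   # Project, loaded afterwards; it continues the numbering wherever the
   # framework stopped, so no gap can occur:
   (dbs+ (inc (length *Dbs))
      (3 +Blog)
      (3 (+Blog url)) )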

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe


Re: A dbs+ problem

2013-09-07 Thread Henrik Sarvell
After some testing I think the basic problem is that I try to use a higher
number than the dbs call has resulted in, thus causing a "gap"; it seems
this is a no-no.

A small test:

(load "dbg.l")

(class +Server +Entity)
(rel ip    (+Key +String))

(class +Entry +Entity)
(rel tag   (+Ref +String))

(dbs
   (3 +Server)
   (3 (+Server ip)))

(dbs+ 100
   (4 +Entry)
   (4 (+Entry tag)) )

(pool "/opt/picolisp/projects/test/db/" *Dbs)

(new! '(+Server) 'ip "an ip")
(new! '(+Entry) 'tag "a tag")

(mapc show (collect 'ip '+Server))
(mapc show (collect 'tag '+Entry))

Even with an empty/fresh db dir the above results in:
!? (pass new (or (meta "Typ" 'Dbf 1) 1) "Typ")
Bad DB file

Works fine if 100 is changed to 3 though.
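
So with the base 'dbs' call defining files 1 and 2, the extension
apparently has to start right at 3; the version that works for me:

(dbs
   (3 +Server)
   (3 (+Server ip)) )

(dbs+ 3                      # the next free file number; 100 leaves a gap
   (4 +Entry)
   (4 (+Entry tag)) )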

This is causing problems for me with the extended framework I'm working on:
the framework loads its own E/R structure with a dbs call. The thought
here is that projects that use the framework will first load it and its
E/R. Then the project code will add its own E/R through a dbs+ call.

The crux is that since the framework will always be under development it
might need more database numbers. I was hoping that I could pick some
arbitrarily high number like 100 as a rule for projects to use in their
dbs+ calls to ensure that there would never be a collision.

Any suggestions on how this conundrum can be resolved?








On Sat, Sep 7, 2013 at 1:56 PM, Alexander Burger wrote:

> Hi Henrik,
>
> > Hi, I'm in a bind. I'm using dbs+ with, say, 7; suddenly I find that I
> > need to increase the number to make more room for new stuff in the
> > "base" E/R, so to speak.
> >
> > Simply changing the number doesn't work; it gives me "Bad DB file",
> > which is quite understandable.
> >
> > How can I resolve this without losing any data?
>
> This is a tough one. A lot of database objects must be moved to
> different files.
>
> I would, in general, not recommend that, and rather try to extend the DB
> "horizontally", i.e. by putting new classes and indexes into existing
> 'dbs' entries (files).
>
>
> But if absolutely necessary, it can be done. I see two ways:
>
> 1. Completely rebuild the DB. For that, export all entities:
>
>(load "@lib/too.l")  # for 'dump'
>
>(out "myData.l"
>   (prinl "# " (stamp))
>   (prinl)
>   (prinl "# Roles")
>   (dump (db nm +Role @@))
>   (println '(commit))
>   (prinl)
>   (prinl "# User")
>   (dump (db nm +User @@))
>   (println '(commit))
>   (prinl)
>   (prinl "# SomeClass")
>   (dump (db key +Cls1 @@))
>   (println '(commit))
>   (prinl)
>   (prinl "# SomeOtherClass")
>   (dump (db key +Cls2 @@))
>   (println '(commit))
>   (prinl)
>   ... )
>
>You must find proper Pilog expressions to select all entities. This
>depends on the class(es) and the indexes involved. To put it simply,
>you must find an index for each class that is complete, i.e. which
>indexes all objects of that class.
>
>Take care not to forget anything :)
>
>The resulting file can be imported into the newly structured, but
>still empty, DB with
>
>   : (load "myData.l")
>
>This works rather well, but requires thorough testing and checking of
>the new DB.
>
>
> 2. There is a possibility to re-organize an existing DB. This can be
>done with 'dbfMigrate' in "@lib/too.l". At least theoretically ;-)
>
>I haven't yet used it in that way. But I used it rather often to port
>databases from pil32 to pil64 format, which involves similar
>problems, and it always worked reliably.
>
>So you might give it a try (after backing up your DB, of course). When
>you changed '*Dbs' (i.e. the 'dbs' and 'dbs+' calls), you open the DB
>as before
>
>   : (pool "db/xxx/" *Dbs)
>
>and then cross your fingers and call
>
>   : (dbfMigrate "db/xxx/" *Dbs)
>
>Also, you should perform the usual DB check
>
>   : (dbCheck)
>
>I hope I didn't forget anything ;-)
>
> ♪♫ Alex
> --
> UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe
>


Re: A dbs+ problem

2013-09-07 Thread Alexander Burger
Hi Henrik,

> Hi, I'm in a bind. I'm using dbs+ with, say, 7; suddenly I find that I need
> to increase the number to make more room for new stuff in the "base" E/R,
> so to speak.
> 
> Simply changing the number doesn't work; it gives me "Bad DB file", which is
> quite understandable.
> 
> How can I resolve this without losing any data?

This is a tough one. A lot of database objects must be moved to
different files.

I would, in general, not recommend that, and rather try to extend the DB
"horizontally", i.e. by putting new classes and indexes into existing
'dbs' entries (files).


But if absolutely necessary, it can be done. I see two ways:

1. Completely rebuild the DB. For that, export all entities:

   (load "@lib/too.l")  # for 'dump'

   (out "myData.l"
  (prinl "# " (stamp))
  (prinl)
  (prinl "# Roles")
  (dump (db nm +Role @@))
  (println '(commit))
  (prinl)
  (prinl "# User")
  (dump (db nm +User @@))
  (println '(commit))
  (prinl)
  (prinl "# SomeClass")
  (dump (db key +Cls1 @@))
  (println '(commit))
  (prinl)
  (prinl "# SomeOtherClass")
  (dump (db key +Cls2 @@))
  (println '(commit))
  (prinl)
  ... )

   You must find proper Pilog expressions to select all entities. This
   depends on the class(es) and the indexes involved. To put it simply,
   you must find an index for each class that is complete, i.e. which
   indexes all objects of that class.

   Take care not to forget anything :)

   The resulting file can be imported into the newly structured, but
   still empty, DB with

  : (load "myData.l")

   This works rather well, but requires thorough testing and checking of
   the new DB.


2. There is a possibility to re-organize an existing DB. This can be
   done with 'dbfMigrate' in "@lib/too.l". At least theoretically ;-)

   I haven't yet used it in that way. But I used it rather often to port
   databases from pil32 to pil64 format, which involves similar
   problems, and it always worked reliably.

   So you might give it a try (after backing up your DB, of course). Once
   you have changed '*Dbs' (i.e. the 'dbs' and 'dbs+' calls), you open the DB
   as before

  : (pool "db/xxx/" *Dbs)

   and then cross your fingers and call

  : (dbfMigrate "db/xxx/" *Dbs)

   Also, you should perform the usual DB check

  : (dbCheck)

   I hope I didn't forget anything ;-)
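
Putting option 1 together as a whole, the sequence would roughly be
(paths are placeholders, and the two 'pool' calls are made with the old
and the new '*Dbs' respectively):

   (load "@lib/too.l")       # for 'dump'
   (pool "db/xxx/" *Dbs)     # 1. open the DB with the _old_ layout
   (out "myData.l" ... )     # 2. export everything, as sketched above

   # ... edit the 'dbs'/'dbs+' calls, restart with the _new_ layout ...

   (pool "db/xxx2/" *Dbs)    # 3. create the new, empty DB
   (load "myData.l")         # 4. re-import
   (dbCheck)                 # 5. and check the result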

♪♫ Alex
-- 
UNSUBSCRIBE: mailto:picolisp@software-lab.de?subject=Unsubscribe