date:20010626

[Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


Hello Zopistas,

thank you all for your replies.

Our doubts are still unresolved :-(

With a clever hack that Toby Dickenson made to the very useful tranalyzer,
we were able to see what happens when we add or catalog an object.
(BTW, we don't use CatalogAware.)

We can send the output of tranalyzer2 to anyone interested, but in short
this is what happens in an empty folder (and I remind you that as the
folder gets populated, the size added by each transaction grows; a folder
with one hundred objects adds some 100K):

If we add a normal DTML document (no catalog involved) to an empty folder,
we see a very small increase in size, just the size of the DTML document
and the size of the folder:

TID: 33D853C2CE6CDBB @ 77396692 obs 2 len 729
By ciao
/aacucu/addDTMLDocument
OID: 40817 len 270 [OFS.Folder.Folder]
OID: 40818 len 309 [OFS.DTMLDocument.DTMLDocument]

If we add an Articolo that's cataloged on the fly to the same empty
folder, we get bloat:

TID: 33D853D722FA167 @ 77397437 obs 96 len 226568
By ciao
/aacucu/Articolo_add
OID: 40817 len 363 [OFS.Folder.Folder]
OID: 40819 len 598 [*ennPsHQQKY5zjxlQs1ebmA==.Articolo]
OID: 407b5 len 8074 [BTrees.IOBTree.IOBucket]
OID: 37aa9 len 39 [BTrees.Length.Length]
OID: 37b95 len 1483 [BTrees.OIBTree.OIBucket]
OID: 407b7 len 1739 [BTrees.IOBTree.IOBucket]
OID: 407b8 len 402 [BTrees.IIBTree.IISet]
OID: 407b9 len 399 [BTrees.IOBTree.IOBucket]
OID: 407ba len 402 [BTrees.IIBTree.IISet]
OID: 407bb len 3497 [BTrees.IOBTree.IOBucket]
OID: 407bc len 5871 [BTrees.OOBTree.OOBucket]
OID: 37ab2 len 39 [BTrees.Length.Length]
OID: 407c6 len 6279 [BTrees.IOBTree.IOBucket]
OID: 3d7bf len 312 [BTrees.IIBTree.IISet]
OID: 407c7 len 4507 [BTrees.IOBTree.IOBucket]
OID: 3c992 len 837 [BTrees.OOBTree.OOBucket]
OID: 37abe len 39 [BTrees.Length.Length]
OID: 407d2 len 696 [BTrees.IOBTree.IOBucket]
OID: 3cb4e len 572 [BTrees.IIBTree.IISet]
OID: 407d3 len 537 [BTrees.IOBTree.IOBucket]
OID: 40809 len 387 [BTrees.IIBTree.IISet]
OID: 407d4 len 507 [BTrees.IOBTree.IOBucket]
OID: 4080a len 387 [BTrees.IIBTree.IISet]
OID: 407d5 len 507 [BTrees.IOBTree.IOBucket]
OID: 4080b len 387 [BTrees.IIBTree.IISet]
OID: 407d6 len 507 [BTrees.IOBTree.IOBucket]
OID: 4080c len 387 [BTrees.IIBTree.IISet]
OID: 407d7 len 339 [BTrees.IOBTree.IOBucket]
OID: 4080d len 382 [BTrees.IIBTree.IISet]
OID: 407d8 len 339 [BTrees.IOBTree.IOBucket]
OID: 4080e len 382 [BTrees.IIBTree.IISet]
OID: 407d9 len 339 [BTrees.IOBTree.IOBucket]
OID: 3d064 len 597 [BTrees.IIBTree.IISet]
OID: 407da len 347 [BTrees.IOBTree.IOBucket]
OID: 4080f len 387 [BTrees.IIBTree.IISet]
OID: 407db len 339 [BTrees.IOBTree.IOBucket]
OID: 3d1ba len 642 [BTrees.IIBTree.IISet]
OID: 407dc len 339 [BTrees.IOBTree.IOBucket]
OID: 40810 len 372 [BTrees.IIBTree.IISet]
OID: 407dd len 339 [BTrees.IOBTree.IOBucket]
OID: 40811 len 372 [BTrees.IIBTree.IISet]
OID: 407de len 339 [BTrees.IOBTree.IOBucket]
OID: 37f11 len 977 [BTrees.IOBTree.IOBucket]
OID: 380de len 830 [BTrees.OIBTree.OIBucket]
OID: 37ac4 len 25537 [BTrees.IIBTree.IISet]
OID: 37ac7 len 9892 [BTrees.IIBTree.IISet]
OID: 37aca len 13947 [BTrees.IIBTree.IISet]
OID: 38922 len 387 [BTrees.IIBTree.IISet]
OID: 38643 len 827 [BTrees.IIBTree.IISet]
OID: 3894c len 92 [BTrees.IIBTree.IISet]
OID: 388ff len 24707 [BTrees.IIBTree.IISet]
OID: 38581 len 277 [BTrees.IIBTree.IISet]
OID: 3d7f7 len 319 [BTrees.IOBTree.IOBTree]
OID: 3d7f8 len 356 [BTrees.IOBTree.IOBTree]
OID: 40812 len 372 [BTrees.IIBTree.IISet]
OID: 407e0 len 339 [BTrees.IOBTree.IOBucket]
OID: 40813 len 387 [BTrees.IIBTree.IISet]
OID: 407e1 len 339 [BTrees.IOBTree.IOBucket]
OID: 40814 len 362 [BTrees.IIBTree.IISet]
OID: 407e2 len 507 [BTrees.IOBTree.IOBucket]
OID: 37eb9 len 981 [BTrees.IOBTree.IOBucket]
OID: 38197 len 804 [BTrees.OIBTree.OIBucket]
OID: 38ac7 len 7947 [BTrees.IIBTree.IISet]
OID: 387f6 len 97 [BTrees.IIBTree.IISet]
OID: 383f7 len 850 [BTrees.OOBTree.OOBucket]
OID: 4081a len 47 [BTrees.IIBTree.IISet]
OID: 38407 len 850 [BTrees.OOBTree.OOBucket]
OID: 4081b len 47 [BTrees.IIBTree.IISet]
OID: 388ac len 92 [BTrees.IIBTree.IISet]
OID: 387d4 len 152 [BTrees.IIBTree.IISet]
OID: 3868c len 152 [BTrees.IIBTree.IISet]
OID: 38681 len 142 [BTrees.IIBTree.IISet]
OID: 388b0 len 72 [BTrees.IIBTree.IISet]
OID: 384f1 len 52 [BTrees.IIBTree.IISet]
OID: 37ca6 len 586 [BTrees.IOBTree.IOBucket]
OID: 4081c len 686 [BTrees.IOBTree.IOBucket]
OID: 37ab8 len 39336 [BTrees.IOBTree.IOBTree]
OID: 381d8 len 594 [BTrees.OIBTree.OIBucket]
OID: 38ac9 len 1252 [BTrees.IIBTree.IISet]
OID: 38770 len 52 [BTrees.IIBTree.IISet]
OID: 37d94 len 1234 [BTrees.IOBTree.IOBucket]
OID: 3821d len 617 [BTrees.OIBTree.OIBucket]
OID: 38acb len 557 [BTrees.IIBTree.IISet]
OID: 38710 len 52 [BTrees.IIBTree.IISet]
OID: 386ac len 52 [BTrees.IIBTree.IISet]
OID: 38409 len 1019 [BTrees.OOBTree.OOBucket]
OID: 4081d len 47 [BTrees.IIBTree.IISet]
OID: 3870b len 52 [BTrees.IIBTree.IISet]
OID: 38403 len 816 [BTrees.OOBTree.OOBucket]
OID: 4081e len 47 [BTrees.IIBTree.IISet]
OID: 387fe len 57 [BTrees.IIBTree.IISet]
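
A dump like the one above is easier to read once the records are grouped by class. The following is only a minimal sketch in present-day Python (not the Python 1.5.2 used on this list); it assumes the line layout shown above, and the names RECORD, summarize and SAMPLE are made up for illustration and are not part of tranalyzer.

```python
import re
from collections import defaultdict

# Matches record lines such as: "OID: 40817 len 270 [OFS.Folder.Folder]"
RECORD = re.compile(r"OID:\s+(\w+)\s+len\s+(\d+)\s+\[(.+)\]")


def summarize(dump):
    """Return {class_name: (record_count, total_bytes)} for one transaction dump."""
    totals = defaultdict(lambda: [0, 0])
    for line in dump.splitlines():
        match = RECORD.search(line)
        if match is None:
            continue  # skip the TID, user and path lines, and blank lines
        _oid, length, class_name = match.groups()
        totals[class_name][0] += 1
        totals[class_name][1] += int(length)
    return {name: tuple(pair) for name, pair in totals.items()}


# The small transaction quoted above, used as sample input.
SAMPLE = """\
TID: 33D853C2CE6CDBB @ 77396692 obs 2 len 729
By ciao
/aacucu/addDTMLDocument
OID: 40817 len 270 [OFS.Folder.Folder]
OID: 40818 len 309 [OFS.DTMLDocument.DTMLDocument]
"""

if __name__ == "__main__":
    by_size = sorted(summarize(SAMPLE).items(), key=lambda kv: -kv[1][1])
    for name, (count, size) in by_size:
        print(f"{size:8d} bytes in {count:3d} record(s)  {name}")
```

Feeding it the large Articolo_add transaction instead of SAMPLE would show at a glance which BTree classes account for most of the bytes written.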

[Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough



Hi Giovanni,

How many indexes do you have, what are the index types, and what do they
index?  Likewise, what about metadata?  In your last message, you said
there's about 20.  That's a heck of a lot of indexes.  Do you need them
all?

I can see a potential reason for the problem you describe as "and I
remind you that as the folder get populated, the size that is added to
each transaction grows, a folder with one hundred objects adds some
100K"... It's true that normal folders (most ObjectManager-derived
containers, actually) cause database bloat in undoing storages when
an object is added to or removed from them.  This is because a folder
keeps a list of contained subobject names in an _objects attribute,
which is a tuple.  When an object is added, the tuple is rewritten in
its entirety.  So for instance, if you've got 100 items in your folder,
and you add one more, you rewrite all the instance data for the folder
itself, which includes the (large) _objects tuple (and of course, any
other raw attributes, like properties).  Over time, this can be
problematic.
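
The effect is easy to reproduce outside Zope. The sketch below is only an illustration in present-day Python: folder_state and record_size are made-up names, and the dictionary is a toy stand-in for a folder's instance data, not the real ObjectManager pickle. It shows that a record holding the whole tuple grows with the number of children, so every add rewrites more bytes than the one before.

```python
import pickle


def folder_state(n_items):
    """Toy stand-in for the instance data of a folder with n_items children."""
    # Assumed layout: one small dict per child, all held in a single tuple,
    # so the tuple is serialized as part of the folder's own record.
    objects = tuple(
        {"id": f"doc{i:04d}", "meta_type": "DTML Document"}
        for i in range(n_items)
    )
    return {"title": "", "_objects": objects}


def record_size(n_items):
    """Bytes written for the folder record when it holds n_items children."""
    return len(pickle.dumps(folder_state(n_items), protocol=2))


if __name__ == "__main__":
    for n in (0, 1, 100, 1000):
        print(f"{n:5d} children -> {record_size(n):7d} bytes per rewrite")
```

In an undoing storage each of those rewrites is kept until the database is packed, which is why the cost accumulates.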

Shane's BTreeFolder Product attempts to ameliorate this problem a bit by
keeping the data that is normally stored in the _objects tuple in its
own persistent object (a btree).

Are you breaking the content up into subfolders?  This is recommended.

I'm tempted to postulate that perhaps your problem isn't as much ZCatalog
as it is ObjectManager overhead.

- C



[Zope-dev] ObjectManager Bloat (was Re: [Zope] Re: Zcatalog bloat problem(berkeleydb is a solution?))

2001-06-26 Thread Chris Withers


Chris McDonough wrote:
>
> Shane's BTreeFolder Product attempts to ameliorate this problem a bit by
> keeping the data that is normally stored in the _objects tuple in its
> own persistent object (a btree).
>
> Are you breaking the content up into subfolders?  This is recommended.

Do you still need to do this if you're using a BTreeFolder?

cheers,

Chris

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )

[Zope-dev] Re: ObjectManager Bloat (was Re: [Zope] Re: Zcatalog bloat problem(berkeleydb is a solution?))

2001-06-26 Thread Chris McDonough


Chris Withers wrote:
>
> Chris McDonough wrote:
> >
> > Shane's BTreeFolder Product attempts to ameliorate this problem a bit by
> > keeping the data that is normally stored in the _objects tuple in its
> > own persistent object (a btree).
> >
> > Are you breaking the content up into subfolders?  This is recommended.
>
> Do you still need to do this if you're using a BTreeFolder?

It doesn't hurt, but likely no.  If at all, you'd want to do it so
management interface views would be sane.

Then again, I've never actually used BTreeFolder.  ;-)

- C


[Zope-dev] Re: ObjectManager Bloat

2001-06-26 Thread Chris Withers


Chris McDonough wrote:
>
> It doesn't hurt, but likely no.  If at all, you'd want to do it so
> management interface views would be sane.
>
> Then again, I've never actually used BTreeFolder.  ;-)

Ah, you should, it's great :-)

The management interface is different, so you don't have problems with lots of
objects.

Both FreeZope servers have BTreeFolders storing the accounts, and they've got
between 500 and 1000 users on each server...

I think they need to be updated to use the new BTrees, but I don't know if
that's a problem... thanks to Shane, again :-)

cheers,

Chris


[Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


Hi Chris,

I don't think this is an ObjectManager problem, even if it contributes to
the bloat.

We do break the content up into subfolders, but our subfolders easily grow
to contain several hundred objects.

Do you think that the number of indexes contributes to the bloat? If this
is important, we can try to consolidate them into a smaller number (e.g.
the boolean indexes can become a sort of bitmask, we can eliminate the
meta_type index, etc.).
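
The bitmask idea could look roughly like the sketch below. It is an illustration only, in present-day Python: it assumes the ten bflow0..bflow9 indexes are the boolean ones, and pack_flags and has_flag are made-up helper names, not part of ZCatalog.

```python
# Assumption: the ten boolean flags are the bflow0 .. bflow9 attributes.
FLAG_NAMES = tuple(f"bflow{i}" for i in range(10))


def pack_flags(flags):
    """Fold a {name: bool} mapping into one integer; bit i stands for bflow{i}."""
    mask = 0
    for bit, name in enumerate(FLAG_NAMES):
        if flags.get(name):
            mask |= 1 << bit
    return mask


def has_flag(mask, name):
    """Report whether the named flag is set in a packed mask."""
    return bool(mask & (1 << FLAG_NAMES.index(name)))


if __name__ == "__main__":
    mask = pack_flags({"bflow0": True, "bflow3": True})
    print(mask, has_flag(mask, "bflow3"), has_flag(mask, "bflow1"))
```

One trade-off to keep in mind: a single index on the packed value matches whole combinations of flags, so a query on one flag alone would have to enumerate every mask in which that bit is set.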

These are our indexes (cut and pasted from the ZMI), followed by our
metadata:

INDEXES:
  PrincipiaSearchSource       Text Index     2,524
  autore                      Keyword Index  4,055
  bflow0                      Field Index    4,055
  bflow1                      Field Index    4,055
  bflow2                      Field Index    4,055
  bflow3                      Field Index    4,055
  bflow4                      Field Index    4,055
  bflow5                      Field Index    4,055
  bflow6                      Field Index    4,055
  bflow7                      Field Index    4,055
  bflow8                      Field Index    4,055
  bflow9                      Field Index    4,055
  bobobase_modification_time  Field Index    4,300
  dflow0                      Field Index    4,055
  dflow1                      Field Index    4,055
  id                          Field Index    4,300
  m_sflow0                    Keyword Index  3,960
  m_sflow1                    Keyword Index  3,960
  m_sflow2                    Keyword Index  3,960
  meta_type                   Field Index    4,300
  pseudoId                    Text Index     4,054
  revisore                    Keyword Index  4,055
  title                       Text Index     3,844

METADATA:

  bobobase_modification_time
  id
  meta_type
  pseudoId
  title


Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


I use Zope 2.3.3 with Python 1.5.2 on FreeBSD 3.

I'm not so picky about bloat, but adding a 1K document adds some 400K,
and it keeps growing.

How much does it grow for you (I know you cataloged some 50K documents)?

-giovanni
- Original Message -
Sent: Tuesday, June 26, 2001 1:48 PM
Subject: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)


> Giovanni, which Zope version are you running?
>
> On Tue, 26 Jun 2001, Chris McDonough wrote:
>
> > How many indexes do you have, what are the index types, and what do
> > they index?  Likewise, what about metadata?  In your last message, you
> > said there's about 20.  That's a heck of a lot of indexes.  Do you
> > need them all?
>
> In my installation I have about 30 or 40
> Position(Text)Index/KeywordIndex/FieldIndex.  They don't bloat much, so I
> don't think that's the problem.  (The problem might be that we have
> different views on what bloating is, though :)



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


Well, I'm not sure, unfortunately.  I just wanted to get an idea of
what kinds of indexes you had.  The tranalyzer output doesn't mean too
much to me, because it shows BTree buckets and such getting updated,
which is completely understandable... there are at least two data
structures in the Catalog itself that use a BTree, and each index uses
at least two BTrees.  So it's not all that surprising to see that
output.  What is surprising is to hear the amount of growth a transaction
causes.  The only things I can think of are that:

a) you're committing inappropriately (at times where it would be OK to
not commit)

b) the data fields you're indexing or getting metadata from are large.

c) something awful happened between 2.3.2 and 2.3.3 that I don't
understand.

d) the problem is unrelated to the Catalog.

I'm afraid I can't be any more precise than that.
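
As a rough cross-check of the "not all that surprising" remark, the lower bound on BTrees involved can be worked out from the index list posted earlier (23 indexes). This is only back-of-the-envelope arithmetic in present-day Python; the constant and function names are made up, and the two "at least" figures are taken straight from the text above.

```python
CATALOG_BTREES = 2    # "at least two data structures in the Catalog itself"
BTREES_PER_INDEX = 2  # "each index uses at least two BTrees"


def min_btrees_touched(n_indexes):
    """Lower bound on BTrees involved when one object is cataloged."""
    return CATALOG_BTREES + BTREES_PER_INDEX * n_indexes


if __name__ == "__main__":
    print(min_btrees_touched(23))
```

For 23 indexes this gives 48. The posted transaction reports "obs 96", which is at least plausible given that the dump lists buckets and sets as separate records.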

-C



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread abel deuring


Hi Giovanni, Chris and all others,


Well, I'm not very familiar with the details of the sub-object
management of ObjectManager and friends. Moreover, so far I have had a
closer look only at UnTextIndex, not at UnIndex or UnKeywordIndex. So
take my comments with a grain of salt.

A text index (class SearchIndex.UnTextIndex) is definitely a cause of
bloat if you use CatalogAware objects. An UnTextIndex maintains, for
each word, a list of the documents in which that word appears. So, if a
document to be indexed contains, say, 100 words, 100 IIBTrees
(containing mappings documentId -> word score) will be updated (see
UnTextIndex.insertForwardIndexEntry). If you have a larger number of
documents, these mappings may be quite large: assume 10,000 documents,
and assume that you have 10 words which appear in 30% of all documents.
Hence, each of the IIBTrees for these words contains 3000 entries. (OK,
one can try to keep the number of frequent words low by using a good
stop word list, but at least for German, such a list is quite difficult
to build. And one can argue that many moderately frequent words
should be indexed in order to allow more precise phrase searches.)

I don't know the details of how data is stored inside the BTrees, so I
can give only a rough estimate of the memory requirements: with 32-bit
integers, we have at least 8 bytes per IIBTree entry (documentId and
score), so each of the 10 BTrees for the frequent words has a minimum
length of 3000*8 = 24000 bytes.

If you now add a new document containing 5 of these frequent words, 5
larger BTrees will be updated. [Chris, let me know if I'm talking
nonsense now...] I assume that the entire updated BTrees (5 * 24000 =
120000 bytes) will be appended to the ZODB (ignoring the less frequent
words) -- even if the document contains only 1 kB of text.
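
The estimate above can be written out as a few lines of arithmetic. This is a sketch in present-day Python under the same assumptions as the text (8 bytes per entry, and the whole mapping rewritten on each update); the function names are made up for illustration.

```python
BYTES_PER_ENTRY = 8  # two 32-bit integers: documentId and score


def mapping_bytes(n_docs, percent_of_docs):
    """Minimum size of one per-word mapping, one entry per matching document."""
    entries = n_docs * percent_of_docs // 100
    return entries * BYTES_PER_ENTRY


def appended_bytes(n_docs, percent_of_docs, frequent_words_in_new_doc):
    """Bytes appended if every touched mapping is rewritten as a whole."""
    return frequent_words_in_new_doc * mapping_bytes(n_docs, percent_of_docs)


if __name__ == "__main__":
    print(mapping_bytes(10000, 30))      # one frequent word
    print(appended_bytes(10000, 30, 5))  # new document with 5 frequent words
```

With the numbers from the text, one mapping is 24000 bytes and a new document containing five frequent words appends 120000 bytes.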

This is why I'm working on some kind of "lazy cataloging".
My approach is to use a Python class (or base class, if ZClasses are
involved) which has a method manage_afterAdd. This method looks for
superValues of a type like "lazyCatalog" (derived from ZCatalog), and
inserts self.getPhysicalPath() into the update list of each lazyCatalog
it finds.

Later, a lazyCatalog can index all objects in this list. Then the
bloating happens either in RAM (without subtransactions), or in a
temporary file, if you use subtransactions.
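
The idea can be sketched without any Zope imports. Everything below is a toy in present-day Python and not the actual implementation: LazyCatalog and LazilyCataloged are hypothetical classes, the catalogs are passed in directly instead of being looked up with superValues, and "indexing" is reduced to storing the object in a dictionary.

```python
class LazyCatalog:
    """Toy stand-in for the lazyCatalog idea: queue paths now, index later."""

    def __init__(self):
        self.pending = []  # physical paths waiting to be indexed
        self.indexed = {}  # path -> object, standing in for the real indexes

    def queue(self, path):
        """Remember a path for later indexing; duplicates are ignored."""
        if path not in self.pending:
            self.pending.append(path)

    def flush(self, resolve):
        """Index every queued path in one batch; resolve(path) returns the object."""
        count = 0
        for path in self.pending:
            self.indexed[path] = resolve(path)
            count += 1
        self.pending = []
        return count


class LazilyCataloged:
    """Toy object whose add hook only records its path instead of indexing."""

    def __init__(self, path, catalogs):
        self._path = tuple(path)
        # Hypothetical shortcut: the catalogs are handed in rather than
        # discovered by walking up the containment hierarchy.
        self._catalogs = catalogs

    def getPhysicalPath(self):
        return self._path

    def manage_afterAdd(self, item, container):
        for catalog in self._catalogs:
            catalog.queue(self.getPhysicalPath())


if __name__ == "__main__":
    catalog = LazyCatalog()
    docs = {}
    for i in range(3):
        doc = LazilyCataloged(("", "aacucu", f"doc{i}"), [catalog])
        docs[doc.getPhysicalPath()] = doc
        doc.manage_afterAdd(doc, None)
    print(len(catalog.pending), "queued;", catalog.flush(docs.__getitem__), "indexed")
```

The point of the design is that the expensive index updates happen once per batch rather than once per added object.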

OK, another approach which might fit your (Giovanni's) needs better is
to use a database other than ZODB, but I'm afraid that even then
instant indexing will be an expensive process if you have a large
number of documents.

Abel


[Zope-dev] Opera seems to cause memory leak on Zope Server (Linux)

2001-06-26 Thread Christian Theune


Hi everybody.

I'm trying to work with Opera, because I love its speed.
The problem is that, from time to time, a request from Opera
(it seems to happen on POST only) causes Zope to eat all the RAM
it can get and all the CPU available.

I investigated and found the following:

it only happens on POST requests
once it happens, it happens on those requests EVERY TIME

I have two files for you, tracing the conversation of an example
POST. The first one traces a POST trying to create a DTML Method
with Opera 5.02 on Linux; the second tries the same thing,
with the same URLs, using Netscape 4.77.
 
-- 
Christian Theune - [EMAIL PROTECTED]
gocept gmbh & co.kg - schalaunische strasse 6 - 06366 koethen/anhalt
tel.+49 3496 3099112 - fax.+49 3496 3099118 mob. - 0178 48 33 981

reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


== POST 
http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/addDTMLMethod
 HTTP/1.0

== User-Agent: Mozilla/4.0 (compatible; MSIE 5.0; Linux) Opera 5.0  [en]

== Host: www.whq.gocept.com:10080

== Accept: text/html, image/png, image/jpeg, image/gif, image/x-xbitmap, */*

== Accept-Language: de,en

== Accept-Encoding: deflate, gzip, x-gzip, identity, *;q=0

== Referer: 
http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/methodAdd

== Cookie: 
tree-s=eJyLjlZ3hAL3SgNbdR2FaCSRKld0EYNkNBEfS19b9VigEJKII5oa30CQGjgAALiSHio; 
zmi_use_css=1; zmi_top_frame=1; sql_pref__rows=20; sql_pref__cols=95; 
dtpref_rows=45; dtpref_cols=115; _ZopeId=94176612Az11pmPfmwg; 
__ac=Y3RoZXVuZTplbnVlaHRj%0a

== Cookie2: $Version=1

== Proxy-Connection: Keep-Alive

== Content-length: 403

== Content-Type: multipart/form-data;

==  boundary=_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN

== 

== --_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN

== Content-Disposition: form-data; name=id

== 

== test1

== --_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN

== Content-Disposition: form-data; name=title

== 

== 

== --_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN

== Content-Disposition: form-data; name=file:string

== 

== 

== --_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN

== Content-Disposition: form-data; name=submit

== 

==  Add and Edit 

== --_OPERAB__-DeUkBe0Y+3V7cF3y+Pn4nN--

[0.024 - Server connected]
== HTTP/1.0 503 Service Unavailable

== Server: Squid/2.2.STABLE5

== Mime-Version: 1.0

== Date: Sat, 23 Jun 2001 15:14:31 GMT

== Content-Type: text/html

== Content-Length: 834

== Expires: Sat, 23 Jun 2001 15:14:31 GMT

== X-Squid-Error: ERR_CONNECT_FAIL 111

== X-Cache: MISS from pegasus.ct.gocept.com

== Proxy-Connection: close

== 

== <HTML><HEAD>
== <TITLE>ERROR: The requested URL could not be retrieved</TITLE>
== </HEAD><BODY>
== <H1>ERROR</H1>
== <H2>The requested URL could not be retrieved</H2>
== <HR>
== <P>
== While trying to retrieve the URL:
== <A HREF="http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/addDTMLMethod">http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/addDTMLMethod</A>
== <P>
== The following error was encountered:
== <UL>
== <LI>
== <STRONG>
== Connection Failed
== </STRONG>
== </UL>
== 
== <P>
== The system returned:
== <PRE><I>(111) Connection refused</I></PRE>
== 
== <P>
== The remote host or network may be down.  Please try the request again.
== </P>
== 
== <br clear="all">
== <hr noshade size=1>
== Generated Sat, 23 Jun 2001 15:14:31 GMT by pegasus.ct.gocept.com (<a href="http://squid.nlanr.net/Squid/">Squid/2.2.STABLE5</a>)
== </BODY></HTML>
[41.139000 - Closed by Server]




== POST 
http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/addDTMLMethod
 HTTP/1.0

== Referer: 
http://www.whq.gocept.com:10080/rat/file/hilfeplan/0_basis/manage_addProduct/OFSP/methodAdd

== Proxy-Connection: Keep-Alive

== User-Agent: Mozilla/4.77 [en] (X11; U; Linux 2.4.5 i686)

== Host: www.whq.gocept.com:10080

== Accept: image/gif, image/x-xbitmap, image/jpeg, image/pjpeg, image/png, */*

== Accept-Encoding: gzip

== Accept-Language: en

== Accept-Charset: iso-8859-1,*,utf-8

== Cookie: tree-s=eJzTiFZ3hAIfS19b9VgdTQAujgSb; dtpref_rows=20; dtpref_cols=75; 
sql_pref__rows=20; sql_pref__cols=80; __ac=Y3RoZXVuZTplbnVlaHRj%0a

== Content-type: multipart/form-data; 
boundary=---10181086171331170261630052973

== Content-Length: 538

== 

== -10181086171331170261630052973

== Content-Disposition: form-data; name=id

== 

== test1

== -10181086171331170261630052973

== Content-Disposition: form-data; name=title

== 

== 

== -10181086171331170261630052973

== Content-Disposition: form-data; name=file:string; filename=

== 

== 

== -10181086171331170261630052973

== Content-Disposition: form-data; name=submit

== 

==  Add and Edit 

== -10181086171331170261630052973--

== HTTP/1.0 302 Moved Temporarily

== Server: Zope/Zope 2.3.2 (source release, python 1.5.2, linux2) ZServer/1.1b1

== Date: Sat, 23 Jun 2001 15:18:03 GMT

==

Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Erik Enge


On Tue, 26 Jun 2001, Giovanni Maruzzelli wrote:

> I'm not so picky about bloating, but adding a document of 1K adds some
> 400K, and keeps growing.
>
> How much eat for you (I know you cataloged some 50K documents)?

I can't remember, but surely not that much.  I had some 30.000 documents
that were about 30-60Kb on average (although some were several megabytes),
in addition to around 50.000 other objects (documents, if you like)
indexed.  My Data.fs would've been around 2.5GB if my memory serves me
correctly.

As I said, I had loads of Indexes too.



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Eric Roby



Chris McDonough Wrote:
 CatalogAware is arguably broken and should really not be used.

 In the meantime, if you care at all about cataloging, do not use
 CatalogAware.  Instead, manage the recataloging yourself and don't
 uncatalog a changed object before recataloging it during this manual
 operation, because this defeats all of the carefully set up change
 detection code (which may or may not still be working since I last
 worked on it ;-)

Chris,

Thank you for your candor here.  I wish this minor detail had been disclosed
in the Zope book.  Chapter 9 was my holy grail when I started down this
trail (creating these new ZClasses that would auto catalog themselves).  It
looked good in print...  I have banked a good deal of my project on this
very service and ... well it is a bit frustrating to find out that I need to
go back and re-do my work.

Along this same vein,  I would suggest that (possibly) ZClasses don't really
work, either, and should not be used.  There was a comment from another
developer (on zope-dev a month or so ago) that essentially (in his own
words) made this very claim.  At the time, I chalked it up to this "Real
Zope Developers Don't Use ZClasses" kinda comment.  There certainly are
enough Zope products out there that (at least) leverage some of the ZClass
plumbing.

Another claim in the Zope book (chapter 8) says that I can leverage my 6+
years of Perl experience to create Zope scripts.  Well, I would suggest that
this doesn't really work, either...

The bottom line to all this venting (and I am not trying to shoot the
messenger here) is that I need to understand where my efforts should be
focused.  If I need to abandon ZClasses in lieu of pure Python, then I need
to know that now so I don't waste any more time on these false starts. The
Perl thing is just a matter of principle (I think Perl's implementation of
OO stinks).  The way it is presented in the book, I would expect it to be a
core Zope thing and not some appendage that requires a particular compiler
and Andy sitting next to you.

I don't intend to abandon Zope, I just need a reality check...

Eric


Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


Eric Roby wrote:
 
 Chris McDonough Wrote:
  CatalogAware is arguably broken and should really not be used.
 
  In the meantime, if you care at all about cataloging, do not use
  CatalogAware.  Instead, manage the recataloging yourself and don't
  uncatalog a changed object before recataloging it during this manual
  operation, because this defeats all of the carefully set up change
  detection code (which may or may not still be working since I last
  worked on it ;-)
 
 Chris,
 
 Thank you for your candor here.  I wish this minor detail had been disclosed
 in the Zope book.  Chapter 9 was my holy grail when I started down this
 trail (creating these new ZClasses that would auto catalog themselves).  It
 looked good in print...  I have banked a good deal of my project on this
 very service and ... well it is a bit frustrating to find out that I need to
 go back and re-do my work.

Well.. actually, it's pretty simple to change CatalogAware to work
better for you.  
With a little thought, CatalogAware could be hacked at your end to be
sane for your application.  You needn't rewrite all your code.  It's
just hard for DC to release a perfect CatalogAware that works better and
is completely backwards-compatible.  It's much harder to change it to
work perfectly for everybody (which is our job here ;-) than to change
it to work perfectly for a particular application.

Basically, change the reindex_object method to:

  self.index_object()

Instead of:

  self.unindex_object()
  self.index_object()

That makes CatalogAware much saner and will produce less bloat. 
Actually, maybe I should just go make that change in the trunk and the
2.4 branch, although I'm a little afraid of what (if anything) it will
break for everybody.  To be honest, I really don't have much time to
spend thinking about this, and my fears are probably just FUD.
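
A minimal sketch of the change described above, using a toy in-memory catalog rather than the real ZCatalog. The ToyCatalog class, its writes counter, and the SaneCatalogAware and Document classes are hypothetical stand-ins invented for illustration; only the method names index_object, unindex_object and reindex_object come from the discussion. It assumes the catalog can detect unchanged values, and shows that skipping the unindex step lets that detection do its work:

```python
class ToyCatalog:
    """Hypothetical stand-in for a catalog: one dict per index."""

    def __init__(self, index_names):
        self.indexes = {name: {} for name in index_names}
        self.writes = 0  # counts index entries actually modified

    def catalog_object(self, obj, uid):
        for name, index in self.indexes.items():
            value = getattr(obj, name, None)
            # Change detection: only touch an index whose value differs.
            if index.get(uid) != value:
                index[uid] = value
                self.writes += 1

    def uncatalog_object(self, uid):
        for index in self.indexes.values():
            if uid in index:
                del index[uid]
                self.writes += 1


class SaneCatalogAware:
    """Hypothetical mix-in whose reindex_object does not unindex first."""

    def __init__(self, catalog, uid):
        self._catalog = catalog
        self._uid = uid

    def index_object(self):
        self._catalog.catalog_object(self, self._uid)

    def unindex_object(self):
        self._catalog.uncatalog_object(self._uid)

    def reindex_object(self):
        # No unindex_object() call here, so unchanged indexes stay untouched.
        self.index_object()


class Document(SaneCatalogAware):
    def __init__(self, catalog, uid, title, body):
        super().__init__(catalog, uid)
        self.title = title
        self.body = body


catalog = ToyCatalog(["title", "body"])
doc = Document(catalog, "doc1", "Hello", "first draft")
doc.index_object()
writes_after_add = catalog.writes  # both indexes written once

doc.body = "second draft"
doc.reindex_object()
writes_for_reindex = catalog.writes - writes_after_add  # only "body" changed

# The old behaviour, for comparison: unindex, then index again.
doc.unindex_object()
doc.index_object()
writes_for_old_style = catalog.writes - writes_after_add - writes_for_reindex
```

In this toy model the new-style reindex touches one index entry, while the unindex-then-index sequence touches every entry twice. How much the real catalog saves depends on its own change detection, which this sketch does not reproduce.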

 Along this same vein,  I would suggest that (possibly) ZClasses don't really
 work, either, and should not be used.  There was a comment from another
 developer (on zope-dev a month or so ago) that essentially (in his own
 words) made this very claim.  At the time, I chalked it up to this "Real
 Zope Developers Don't Use ZClasses" kinda comment.  There certainly are
 enough Zope products out there that (at least) leverage some of the ZClass
 plumbing.

Well, I don't use ZClasses much.  But that's because I like to use Emacs.
 
 Another claim in the Zope book (chapter 8) says that I can leverage my 6+
 years of Perl experience to create Zope scripts.  Well, I would suggest that
 this doesn't really work, either...

Not sure what you mean by "doesn't work", but I assume you've had an
unpleasant experience with zope-perl?
 
 The bottom line to all this venting (and I am not trying to shoot the
 messenger here) is that I need to understand where my efforts should be
 focused.  If I need to abandon ZClasses in lieu of pure Python, then I need
 to know that now so I don't waste any more time on these false starts. The

I'll go out on a limb here.  You should learn how to write Python
Products if you're serious about creating reusable Zope applications. 
There.

 Perl thing is just a matter of principle (I think Perl's implementation of
 OO stinks).  The way it is presented in the book, I would expect it to be a
 core Zope thing and not some appendage that requires a particular compiler
 and Andy sitting next to you.

I've sort of enjoyed myself on all the times when Andy has been sitting
near me, but I understand.  ;-) Jim had a bad experience installing
zope-perl lately.  I wish I could help.  Strangely, myself, I had few
problems getting it installed and working fine.  Maybe I'm just lucky. 
I actually think zope-perl is sort of an engineering marvel myself.
 
 I don't intend to abandon Zope, I just need a reality check...

HTH,

- C


Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread Barry A. Warsaw



 JA == Jerome Alet [EMAIL PROTECTED] writes:

JA For Zope it's not sure, but for Python, as well as for all
JA what people usually call open source languages, the license
JA of choice should be the GPL, or at least the LGPL, in order
JA for the language in question to not become bastardized by some
JA powerful entity.

I think I'm accurately channeling Guido when I say that Python will
never be GPL'd.  AFAIK, there is no GPL code even in the standard
Python distribution.  Both of those states of affairs are by conscious
decision: regardless of what you think of the GPL (and I personally
happen to believe it can be a good license for /some/ software, but
not all) GPL'ing Python would be a very bad thing.  Guido has always
intended for people to do whatever they want with Python, including
using it in everything from closed source, proprietary, big-$$$
software to completely free software.  That's been a key to Python's
success, IMO.  I don't think anybody's really concerned that forking
and bastardizing is a real threat.  Heck, if you include
Jython/JPython, .NET Python, Vyper, and Stackless there are already
forks of Python out in the world getting real use.  (C)Python's
success hasn't suffered one bit, in fact, it's probably /benefitted/
from them.

Cheers,
-Barry


Re: [Zope] Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


Off the top of my head, I don't think there are any.  But this is why I
haven't fixed it yet, because I'd need to think about it past "off the
top of my head".  ;-)

- C


Casey Duncan wrote:
 
 
 What if any disadvantages are there to not calling unindex_object first?
 If there aren't any good ones, I think I'll be rewriting some of my own
 CatalogAware code...
 --
 | Casey Duncan
 | Kaivo, Inc.
 | [EMAIL PROTECTED]
 `--
 

Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Erik Enge


On Tue, 26 Jun 2001, Eric Roby wrote:

 The bottom line to all this venting (and I am not trying to shoot the
 messenger here) is that I need to understand where my efforts should
 be focused.  If I need to abandon ZClasses in lieu of pure Python,
 then I need to know that now so I don't waste any more time on these
 false starts.

If your application can't be written in five minutes and you expect to use
it more than once, you shouldn't use ZClasses - IMO.  The only argument
for ZClasses (that I had at the time) was that it was very easy and fast
to set up a couple of classes and some instances.  After I wrote mk-zprod,
making Python Products is even faster than ZClasses, and certainly scales
better.

If you ask me, it would be better to streamline the Zope API a bit and
focus the effort on making it easier to start developing Python Products
at first go, instead of stopping by ZClasses.  I can't see the rationale
for ZClasses, but I'm sure there is one.  Right?

I seem to recall some buzz about Python Products starting to be alive in
the Zope instance (ie. behaving much like ZClasses) in a future release.  
I don't know if that's a good thing or not, but if it means ditching
ZClasses I'm all for it.



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Toby Dickenson


INDEXES:
  PrincipiaSearchSource Text Index 2,524
  autore Keyword Index 4,055
  bflow0 Field Index 4,055
  bflow1 Field Index 4,055
  bflow2 Field Index 4,055


Aha! a clue.

If that is the output of the 'Indexes' tab then I don't think you are
using the newest ZCatalog. A recent release (I'm not sure which,
2.3.2?) has a new BTree implementation that reduces bloat by modifying
fewer buckets (it also doesn't have the column showing index size).

Toby Dickenson
[EMAIL PROTECTED]


Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


I'm sorry to say that Toby is right about the version from which I cut
and pasted the following, but we are also using a newer version and
the problem is the same.

We're working our way through the first bytes of the raw dump from Toby's
new, magnificent tranalyzer (it really ought to be a standard tool in the
Zope distro), and we now have some hints about what happens when you
catalog something.

So, we are starting to optimize indexes and metadata, but the problem does
not seem to go away.

-giovanni


- Original Message -
From: Toby Dickenson [EMAIL PROTECTED]
To: Giovanni Maruzzelli [EMAIL PROTECTED]
Cc: Chris McDonough [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, June 26, 2001 5:49 PM
Subject: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a
solution?)


 INDEXES:
   PrincipiaSearchSource Text Index 2,524
   autore Keyword Index 4,055
   bflow0 Field Index 4,055
   bflow1 Field Index 4,055
   bflow2 Field Index 4,055


 Aha! a clue.

 If that is the output of the 'Indexes' tab then I don't think you are
 using the newest ZCatalog. A recent release (I'm not sure which,
 2.3.2?) has a new BTree implementation that reduces bloat by modifying
 fewer buckets (it also doesn't have the column showing index size).

 Toby Dickenson
 [EMAIL PROTECTED]



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris Withers


Toby Dickenson wrote:
 
 INDEXES:
   PrincipiaSearchSource Text Index 2,524
   autore Keyword Index 4,055
   bflow0 Field Index 4,055
   bflow1 Field Index 4,055
   bflow2 Field Index 4,055
 
 Aha! a clue.
 
 If that is the output of the 'Indexes' tab then I don't think you are
 using the newest ZCatalog. A recent release (I'm not sure which,
 2.3.2?) has a new BTree implementation that reduces bloat by modifying
 fewer buckets (it also doesn't have the column showing index size).

Has the person concerned run the catalog update tool when they upgraded their
Zope version?

cheers,

Chris


[Zope-dev] ZPatterns DATA LOSS BUG and FIX (0.4.3p1 patch release)

2001-06-26 Thread Phillip J. Eby


ZPatterns 0.4.3final contains a serious bug which deletes all ZODB-stored
contents of a Rack when you use the manage_pack method.  This bug only
affects you if you store Rack-mounted objects or attributes in the ZODB,
and does not affect you if your objects are entirely contained in an RDBMS
or other mechanism external to the rack.  If you've upgraded to 0.4.3,
please do not use manage_pack!

I have uploaded new versions of ZPatterns and PlugIns to fix the problem
with packing racks, and to provide experimental support for Zope 2.4b1.
(Specifically the problems reported below with expr_globals and ts_regex
have been fixed.)  Please upgrade to 0.4.3p1 before using manage_pack, or
if you wish to experiment with ZPatterns on Zope 2.4b1.  Thanks.

Thanks too, to Juan Palomar and Itai Tavor for providing information and
patches.


At 11:01 AM 6/26/01 +0200, Juan David Ibáñez Palomar wrote:


With Zope 2.4b1 and ZPatterns 0.4.3, the following error is raised when
starting Zope:

2001-06-26T08:24:40 ERROR(200) Zope Could not import Products.ZPatterns
Traceback (innermost last):
  File /home/jdavid/Zope-2.4.0b1/lib/python/Products/ZPatterns/Expressions.py, line 38, in ?
    (Object: ComputedAttribute)
ImportError: cannot import name expr_globals

expr_globals is not available in Zope 2.4, the fix is:

...

Besides this error, there are some warnings:

/home/jdavid/Zope-2.4.0b1/lib/python/ts_regex.py:87: DeprecationWarning: the regex module is deprecated; please use the re module
  import regex, regsub #, Sync
/home/jdavid/Zope-2.4.0b1/lib/python2.1/regsub.py:15: DeprecationWarning: the regsub module is deprecated; please use re.sub()
  DeprecationWarning)



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


The catalog is a pristine 2.3.3b1 catalog.

We have recreated the catalog from scratch because we tried
manage_convertBTrees, but it doesn't work for us; it returns an error
(and the same happens with 2.3.3 final):

Error Type: TypeError
Error Value: second argument must be a class


Traceback (innermost last):
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/ZPublisher/Publish.py, line 223, in publish_module
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/ZPublisher/Publish.py, line 187, in publish
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/Zope/__init__.py, line 221, in zpublisher_exception_hook
    (Object: Traversable)
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/ZPublisher/Publish.py, line 171, in publish
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/ZPublisher/mapply.py, line 160, in mapply
    (Object: manage_convertBTrees)
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/ZPublisher/Publish.py, line 112, in call_object
    (Object: manage_convertBTrees)
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/Products/ZCatalog/ZCatalog.py, line 736, in manage_convertBTrees
    (Object: Traversable)
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/Products/ZCatalog/Catalog.py, line 204, in _convertBTrees
  File /fs1root/zope/Zope-2.3.3b1-src/lib/python/SearchIndex/UnTextIndex.py, line 211, in _convertBTrees
TypeError: (see above)



- Original Message -
From: Chris Withers [EMAIL PROTECTED]
To: [EMAIL PROTECTED]
Cc: Giovanni Maruzzelli [EMAIL PROTECTED]; Chris McDonough
[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED];
[EMAIL PROTECTED]; [EMAIL PROTECTED]; [EMAIL PROTECTED]
Sent: Tuesday, June 26, 2001 5:59 PM
Subject: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a
solution?)


 Toby Dickenson wrote:
 
  INDEXES:
PrincipiaSearchSource Text Index 2,524
autore Keyword Index 4,055
bflow0 Field Index 4,055
bflow1 Field Index 4,055
bflow2 Field Index 4,055
 
  Aha! a clue.
 
  If that is the output of the 'Indexes' tab then I don't think you are
  using the newest ZCatalog. A recent release (I'm not sure which,
  2.3.2?) has a new BTree implementation that reduces bloat by modifying
  fewer buckets (it also doesn't have the column showing index size).

 Has the person concerned run the catalog update tool when they upgraded their
 Zope version?

 cheers,

 Chris



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Giovanni Maruzzelli


We think that Abel is absolutely right:

If, in the same almost empty folder, we add and catalog an object with one
word (and we have now optimized and reduced the number of indexes to 11), it
makes a transaction of 73K, while if the object contains 300 words with the
same other indexes or properties, the transaction is 224K, and if all is the
same but the object contains 535 words, the transaction is 331K.

And we are now using a catalog with only some 3000 documents indexed, with
an average length of around 1K per document.
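
As a rough check on these figures, the three measurements can be turned into a growth rate per extra word. This is only a sketch: it assumes the transaction size grows roughly linearly between the measured points, which three data points cannot prove, and the function name growth_per_word is made up for the example.

```python
# (words in object, transaction size in KB) as reported above
measurements = [(1, 73), (300, 224), (535, 331)]


def growth_per_word(samples):
    """Return KB added per extra word between consecutive samples."""
    rates = []
    for (w0, kb0), (w1, kb1) in zip(samples, samples[1:]):
        rates.append((kb1 - kb0) / (w1 - w0))
    return rates


rates = growth_per_word(measurements)
for (w0, _), (w1, _), rate in zip(measurements, measurements[1:], rates):
    print(f"{w0:>3} -> {w1:>3} words: about {rate:.2f} KB per extra word")
```

Both intervals come out at roughly half a kilobyte of transaction data per additional word, on top of the 73K baseline for the other indexes.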

-giovanni

 Well, I'm not very familiar with the details of the sub-object
 management of ObjectManager and friends. Moreover, so far I have had a
 closer look only at UnTextIndex, not at UnIndex or UnKeywordIndex. So
 take my comments with a grain of salt.

 A text index (class SearchIndex.UnTextIndex) is definitely a cause of
 bloating if you use CatalogAware objects. An UnTextIndex maintains, for
 each word, a list of the documents in which the word appears. So, if a
 document to be indexed contains, say, 100 words, 100 IIBTrees
 (containing mappings documentId -> word score) will be updated (see
 UnTextIndex.insertForwardIndexEntry). If you have a large number of
 documents, these mappings may be quite large: assume 10.000 documents,
 and assume that you have 10 words which appear in 30% of all documents.
 Hence, each of the IIBTrees for these words contains 3000 entries. (OK,
 one can try to keep the number of frequent words low by using a good
 stop word list, but at least for German, such a list is quite difficult
 to build. And one can argue that many not-too-frequent words should be
 indexed in order to allow more precise phrase searches.) I don't know
 the details of how data is stored inside the BTrees, so I can give
 only a rough estimate of the memory requirements: with 32-bit integers,
 we have at least 8 bytes per IIBTree entry (documentId and score), so
 each of the 10 BTrees for the frequent words has a minimum length of
 3000*8 = 24000 bytes.

 If you now add a new document containing 5 of these frequent words, 5
 of these large BTrees will be updated. [Chris, let me know if I'm now
 talking nonsense...] I assume that the entire updated BTrees (at least
 5*24000 = 120000 bytes) will be appended to the ZODB (ignoring the less
 frequent words) -- even if the document contains only 1 kB of text.
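
The estimate above can be written out as a short calculation. This is a sketch under Abel's stated assumptions only (8 bytes per entry, whole BTrees rewritten on update); the constant and function names are invented for the example, and real storage overhead is not modelled.

```python
BYTES_PER_ENTRY = 8  # two 32-bit integers: documentId and score


def min_btree_bytes(n_documents, percent_containing_word):
    """Lower bound on the size of one word's documentId -> score mapping."""
    entries = n_documents * percent_containing_word // 100
    return entries * BYTES_PER_ENTRY


# 10.000 documents, each frequent word appears in 30% of them
per_word = min_btree_bytes(10_000, 30)

# A new document containing 5 frequent words rewrites 5 such mappings
appended = 5 * per_word

print(per_word, appended)
```

Under these assumptions a 1 kB document would append at least 120000 bytes for the frequent words alone, which is the same order of magnitude as the transaction sizes reported earlier in the thread.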

 This is why I'm working on some kind of lazy cataloging.
 My approach is to use a Python class (or base class, if ZClasses are
 involved) which has a method manage_afterAdd. This method looks for
 superValues of a type like lazyCatalog (derived from ZCatalog), and
 inserts self.getPhysicalPath() into the update list of each
 lazyCatalog found.

 Later, a lazyCatalog can index all objects in this list. The
 bloating then happens either in RAM (without subtransactions), or in a
 temporary file, if you use subtransactions.
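
A minimal sketch of the queue-then-index idea, written as plain Python so it runs without Zope. The LazyCatalog and LazyCatalogAware classes, the pending list and the process_queue method are all hypothetical names chosen for illustration; the catalogs are passed in directly instead of being found through superValues, and manage_afterAdd is given default arguments so it can be called standalone here.

```python
class LazyCatalog:
    """Hypothetical catalog that queues paths and indexes them in batches."""

    def __init__(self):
        self.pending = []   # physical paths waiting to be indexed
        self.indexed = {}   # path -> object, stands in for the real indexes
        self.commits = 0    # how many times index data was written out

    def queue(self, path):
        if path not in self.pending:
            self.pending.append(path)

    def process_queue(self, resolve):
        """Index everything queued so far in a single batch."""
        if not self.pending:
            return 0
        for path in self.pending:
            self.indexed[path] = resolve(path)
        count = len(self.pending)
        self.pending = []
        self.commits += 1
        return count


class LazyCatalogAware:
    """Hypothetical mix-in: on add, register with catalogs, do not index."""

    def __init__(self, path, catalogs):
        self._path = tuple(path)
        self._catalogs = catalogs  # stands in for the superValues lookup

    def getPhysicalPath(self):
        return self._path

    def manage_afterAdd(self, item=None, container=None):
        for catalog in self._catalogs:
            catalog.queue(self.getPhysicalPath())


catalog = LazyCatalog()
objects = {}
for i in range(3):
    obj = LazyCatalogAware(("", "folder", f"doc{i}"), [catalog])
    objects[obj.getPhysicalPath()] = obj
    obj.manage_afterAdd()

queued = len(catalog.pending)
processed = catalog.process_queue(objects.__getitem__)
```

Three adds here result in one batched index write instead of three. The trade-off is that new objects are not searchable until the queue has been processed.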

 OK, another approach which fits your (Giovanni's) needs better might be
 to use a database other than ZODB, but I'm afraid that even then
 instant indexing will be an expensive process if you have a large
 number of documents.

 Abel



[Zope-dev] Re: [Zope] CatalogAware

2001-06-26 Thread Chris McDonough


Excellent, thanks so much Toby.  Maybe some feedback will come in...

- C


Toby Dickenson wrote:
 
 Chris McDonough [EMAIL PROTECTED] wrote:
 
 I actually think this about sums it up.  If you have time to look at it
 Toby, it would be much appreciated.  I don't think it's a very
 complicated set of fixes, its just not on the radar at the moment, and
 might require some thought about backwards-compatibility.
 
 Not a patch, but I've fixed all three known CatalogAware problems in a
 separate product; a new base class that derives from CatalogAware:
 
 http://www.zope.org/Members/htrd/BetterCatalogAware/
 
 The techniques used in this product have been thoroughly stressed in
 several other production systems, but this is the first time they have
 been collected together in one place so bugs are possible.
 
 That makes CatalogAware much saner and will produce less bloat.
 Actually, maybe I should just go make that change in the trunk and the
 2.4 branch, although I'm a little afraid of what (if anything) it will
 break for everybody.  To be honest, I really don't have much time to
 spend thinking about this, and my fears are probably just FUD.
 
 I'm not sure how many people are using CatalogAware; I think many
 serious users have been scared off by the problem reports in the list
 archives.
 
 IMO fixing this may be worth a little breakage.
 
 Toby Dickenson
 [EMAIL PROTECTED]
 

[Zope-dev] Re: ZPL and GPL

2001-06-26 Thread Fred Wilson Horch


Barry A. Warsaw writes:

 I think I'm accurately channeling Guido when I say that Python will
 never be GPL'd.  AFAIK, there is no GPL code even in the standard
 Python distribution.  Both of those states of affairs are by conscious
 decision: regardless of what you think of the GPL (and I personally
 happen to believe it can be a good license for /some/ software, but
 not all) GPL'ing Python would be a very bad thing.  Guido has always
 intended for people to do whatever they want with Python, including
 using it in everything from closed source, proprietary, big-$$$
 software to completely free software.  That's been a key to Python's
 success, IMO.

I respectfully disagree about the last point.

But it would be nice to hear what Guido thinks, and what Digital
Creations thinks.

Knowing that the copyright holders have made a conscious decision not to
allow developers to obtain Python and Zope under the terms of the GPL in
the belief that this allows people to do whatever they want with it
does help us evaluate the long-term prospects for these systems in the
marketplace.

I would love to take this discussion to a different forum.  Can someone
post the name of the zope licensing list so I don't waste non-lawyers'
time with an analysis of the correlation between licensing schemes and
success of various open source projects (a subject we intellectual
property attorneys find extremely fascinating!)?

Some compelling case studies are Linux, gcc, apache, perl and ruby (see,
in particular, Ruby's choice of licensing provision at
http://www.ruby-lang.org/en/LICENSE.txt -- you can get ruby either under
the GPL or under a home-grown Ruby license -- your choice).

Python is a great language, but it's not the only game in town!

It's very hard for me to see how offering developers the choice to
obtain Python both under the GPL and under a non-free software license
that permits proprietary extensions would harm Python's success.  (But
I'm just an attorney. ;-)

I hope this discussion is interesting and useful.  To people who have
better things to think about: I apologize for taking up time with this,
but I hope one day a discussion of the licensing uncertainties
surrounding Python and Zope will no longer be necessary.

Best regards,
Fred
-- 
Fred Wilson Horch   mailto:[EMAIL PROTECTED]
Executive Director, EcoAccess   http://ecoaccess.org/
P.O. Box 2823, Durham, NC 27715-2823   phone: 919.419-8567


Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalogbloat problem (berkeleydb is a solution?))

2001-06-26 Thread Erik Enge


On Tue, 26 Jun 2001, Morten W. Petersen wrote:

 How about meta-programming (designing) via the Zope interface, with
 UML or somesuch; automatically generating Python code, then enable
 designers to use a ZFormulator-ish product to edit the interface while
 a programmer can work on the 'backend' (emacs on a terminal)?

What are you on my friend?  ;-)

How about writing the whole shebang in <put in favourite editor here>, as
is done today?  I don't see the need for any other change, but I do know
that DC (or at least I suspect this is why they made ZClasses) hopes to
bring Zope-programming to a wider audience with ZClasses, and thus it
might have a purpose in life.

I don't know.  Python is a very easy language to learn.



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Andy McKay


One thing I'd been musing about for a while was a ZClass -> Python Product
script that took your ZClass and set up your basic Python product for you.
It would only work for simple things like permissions, properties, basic
methods... Then ZClasses could be an easier springboard into Python products
for those new to them.

Cheers.
--
  Andy McKay.



[Zope-dev] Hey Chris, question for you

2001-06-26 Thread Casey Duncan


Chris:

I am working on getting a decent query language for ZCatalog/Catalog and
I have been able to make good progress, however I am running into a bit
of an issue that I thought you might know something about:

In order to implement a != query operator, I am trying to do the
following:

From the index, return the result set that matches the value (easy)
Subtract that from the set of all items in the index (not so easy)

I see that there is the difference method available from IIBTree,
however I seem to be unable to use it on the entire index (Which is an
OOBTree and not really a set I guess). Here is a snippet of my code
which doesn't work:

if op == '!=' or op[:3] == 'not':
    w, rs = difference(index._index, rs) # XXX Not a warm fuzzy...

(where rs is the index result set that matches the value and index is
the Catalog index OOBTree)

What can I supply for the first argument to get a set of all items in
the index, or is there any easier and better approach to this whole
issue?

BTW: I realize I could step through _index.items() and create an IISet
but that seems awfully inefficient...
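
One way to picture the subtraction, sketched with plain Python dicts and sets rather than BTrees. It assumes the index keeps a reverse mapping from record id to value alongside the forward mapping, so that the keys of the reverse mapping give the set of all indexed ids; whether the real index exposes such a mapping, and under what name, is not confirmed here. The function name not_equal and the sample data are made up for the example.

```python
def not_equal(index, unindex, value):
    """Return record ids whose indexed value differs from `value`.

    index:   value -> set of record ids (forward mapping)
    unindex: record id -> value (assumed reverse mapping)
    """
    matching = index.get(value, set())
    all_ids = set(unindex)  # keys of the reverse mapping: every indexed id
    return all_ids - matching


index = {"red": {1, 4}, "blue": {2}, "green": {3}}
unindex = {1: "red", 2: "blue", 3: "green", 4: "red"}

result = not_equal(index, unindex, "red")
print(sorted(result))
```

The forward mapping alone is keyed by value, not by record id, which is why handing it to a set difference does not yield "everything except the matches". Starting from the ids avoids walking every value's result set.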

Thanks in advance for any ideas you might have...

-- 
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`--


Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalogbloat problem (berkeleydb is a solution?))

2001-06-26 Thread Morten W. Petersen


On Tue, 26 Jun 2001, Erik Enge wrote:

 On Tue, 26 Jun 2001, Morten W. Petersen wrote:
 
  How about meta-programming (designing) via the Zope interface, with
  UML or somesuch; automatically generating Python code, then enable
  designers to use a ZFormulator-ish product to edit the interface while
  a programmer can work on the 'backend' (emacs on a terminal)?
 
 What are you on my friend?  ;-)

Well, it's quite logical: UML can be used to map out both software and
business development (they are, after all, two sides of the same story),
the designer can twiddle-n-polish the interface and the programmer can
take care of 'exceptional tasks' that can't easily be taken care of via
the UML interface without adding too much complexity.

-Morten



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalogbloat problem (berkeleydb is a solution?))

2001-06-26 Thread Morten W. Petersen


On Tue, 26 Jun 2001, Morten W. Petersen wrote:

 Well, it's quite logical: UML can be used to map out both software and
 business development (they are, after all, two sides of the same story),
 the designer can twiddle-n-polish the interface and the programmer can
 take care of 'exceptional tasks' that can't easily be taken care of via
 the UML interface without adding too much complexity.

The UML interface may be a bit far fetched, but that's because nobody has
done it yet.  ;-)

-Morten





Re: [Zope-dev] Using db_connections from Zope products

2001-06-26 Thread Kent Polk


On 20 Jun 2001 11:25:01 -0500, Tom Brown wrote:
 I would like to make an SQL query directly from python
 code.  Do I have to make a 
 ZSQL Method dynamically, or is there another way
 without making the class database 
 dependent (i.e. Gadfly, PoPy, etc), utilizing an
 existing db_con?  Suppose I am using the ZPoPy DA and 
 have established a database connection externally. 
 How can I access this database 
 and submit a query from my own class?

I have a product which queries a specified database table from the
Product Class and both executes queries entirely in Python, and
builds ZSQL methods - populating them with std queries that can
then be customized.

You can directly use the database connection_id, but you will
probably want to insert:

    import pdb; pdb.set_trace()

in your class and use the debugger as you go.

(I'll throw some probably bad code at you here - caveat emptor)

To start out, somewhere you will want to be able to specify the
connection id to use, such as:

<TR>
  <TD ALIGN="LEFT" VALIGN="TOP">
  <EM><STRONG>Connection Id</STRONG></EM>
  </TD>
  <TD ALIGN="LEFT" VALIGN="TOP">
  <SELECT NAME="connection_id">
<dtml-in SQLConnectionIDs>
  <OPTION VALUE="&dtml-sequence-item;">
  <dtml-var sequence-key></OPTION>
</dtml-in>
  </SELECT>
  </TD>
</TR>

and in your class, somewhere, something like this:

if REQUEST and REQUEST.has_key('connection_id'):
    self.connection_id = REQUEST['connection_id']

To create the zsql methods, I first have to obtain some info:

def dbquery_handle(obj, connection_id):
    """Find and return the Zope database connector query method."""

    database_type = ''

    # Locate the Database Connector
    try:
        dbc = getattr(obj, connection_id)
        database_type = dbc.database_type
    except AttributeError:
        raise AttributeError(
            "The database connection <em>%s</em> cannot be found." %
            (connection_id))

    # Prepare the Database Connector for a query
    # (originally a 'Database Error' string exception, which current
    # Python no longer accepts)
    try:
        DB__ = dbc()
    except:
        raise RuntimeError(
            'Database Error: %s is not connected to a database'
            % connection_id)

    # Return the query method
    return database_type, DB__.query

# There's got to be a more universal way to do this, but I don't
# know what it is
def tableexists(dbtype, dbq, tablename):
    """Query the database to see if the table exists."""
    table_exists = []

    if dbtype == 'MySQL':
        try:
            table_show_query = 'SHOW TABLES LIKE "%s"'
            meta, table_exists = dbq(table_show_query % tablename, 1)
            return table_exists
        except:
            pass

    elif dbtype == 'Sybase':
        try:
            table_show_query = "SELECT name FROM sysobjects \
                WHERE id = object_id('%s')"
            meta, table_exists = dbq(table_show_query % tablename, 1)
            return table_exists
        except:
            pass

    return table_exists

Now..

# ZSQL Method creation
def create_zsqlmethods(self, id, connection_id, properties, maketable=0):
    """Create a series of Zope SQLMethods for this table."""

    schema = []
    tableschema = []
    dbtype, dbquery = dbquery_handle(self, connection_id)
    table_exists = tableexists(dbtype, dbquery, id)

... determine table schema from whatever, and instantiate the 'create'
SQL method:

    create = SQL('createTable', title='',
                 connection_id=connection_id, arguments='',
                 template=table_create_query % (id, vars))

    self._setObject('createTable', create)
    ...

and even create the SQL table if you need to:

    if not table_exists and maketable:
        self.createTable()

etc. etc.
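
The lookup-then-query pattern above can be tried outside Zope with stand-in objects. What follows is only a sketch under stated assumptions: FakeDB, FakeConnection, FakeFolder and run_query are hypothetical names, not Zope classes. The only behavior taken from the code above is that the connection is found by attribute lookup on its id, that calling it returns a DB object, and that query(sql, max_rows) returns a (meta, rows) pair.

```python
class FakeDB:
    """Hypothetical stand-in for the DB object a database adapter returns."""

    def __init__(self, tables):
        self.tables = tables

    def query(self, sql, max_rows=9999999):
        # A real adapter would send `sql` to the database. This stub only
        # understands "SELECT * FROM <table>" so the example runs anywhere.
        table = sql.split()[-1]
        rows = self.tables.get(table, [])[:max_rows]
        meta = [{'name': 'id'}]
        return meta, rows


class FakeConnection:
    """Hypothetical stand-in for a Zope database connection object."""
    database_type = 'Fake'

    def __init__(self, db):
        self._db = db

    def __call__(self):
        # Calling the connection hands back the object that runs queries.
        return self._db


class FakeFolder:
    """Hypothetical stand-in for the container holding the connection."""


def run_query(obj, connection_id, sql, max_rows=100):
    """Look up the connection by id on obj and run sql through it."""
    dbc = getattr(obj, connection_id)  # AttributeError if the id is wrong
    db = dbc()
    return db.query(sql, max_rows)


folder = FakeFolder()
folder.my_db = FakeConnection(FakeDB({'people': [(1,), (2,), (3,)]}))
meta, rows = run_query(folder, 'my_db', 'SELECT * FROM people', max_rows=2)
```

Because the class only ever sees a connection id and a query method, nothing in it depends on which database adapter is installed, which is what the original question asked for.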



Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Chris McDonough


 Chris:

 I am working on getting a decent query language for ZCatalog/Catalog and

Very cool...

 I have been able to make good progress, however I am running into a bit
 of an issue that I thought you might know something about:

 In order to implement a != query operator, I am trying to do the
 following:

Tricky.

 From the index, return the result set that matches the value (easy)
 Subtract that from the set of all items in the index (not so easy)

 I see that there is the difference method available from IIBTree,
 however I seem to be unable to use it on the entire index (which is an
 OOBTree and not really a set I guess). Here is a snippet of my code
 which doesn't work:

 if op == '!=' or op[:3] == 'not':
     w, rs = difference(index._index, rs) # XXX Not a warm fuzzy...

 (where rs is the index result set that matches the value and index is
 the Catalog index OOBTree)

 What can I supply for the first argument to get a set of all items in
 the index, or is there any easier and better approach to this whole
 issue?

Well.. I assume that _index is the forward data structure of a FieldIndex.
In this case, you could get the info you want (a list of all document ids in
the index) from _unindex.keys(), as _index and _unindex are mirror images of
each other that need to be kept in sync... I think what comes back is a
BTreeItems object.  I think this is usable in conjunction with the resultset
IISet (also a list of document ids) via the difference function... I haven't
tried it, though...
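
A minimal sketch of that idea, using plain Python containers rather than the BTrees types: here `forward` stands in for _index (value to document ids), `reverse` stands in for _unindex (document id to value), and builtin sets stand in for IISet. These names and the helper not_equal are assumptions made for illustration. Whether the BTrees difference function accepts the keys of _unindex directly remains untested, as said above.

```python
def not_equal(forward, reverse, value):
    """Return the ids of all indexed documents whose value != `value`."""
    matching = set(forward.get(value, ()))
    everything = set(reverse)  # keys of the reverse mapping = all doc ids
    return everything - matching


# Toy index: four documents with a colour field.
forward = {'red': {1, 4}, 'blue': {2}, 'green': {3}}
reverse = {1: 'red', 2: 'blue', 3: 'green', 4: 'red'}

result = not_equal(forward, reverse, 'red')
```

The point of using the reverse mapping is that its keys are already the complete set of document ids, so nothing has to be collected by walking every value in the forward index.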

 BTW: I realize I could step through _index.items() and create an IISet,
 but that seems awfully inefficient...

Yeah, that'd be terrible.

This is a tricky operator.  I can't really wrap my head around using it in
conjunction with parens.  Then again, maybe you wouldn't...


HTH,

- C




Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread abel deuring


Hi all,

Giovanni Maruzzelli wrote:
 
 We think that Abel is absolutely right:
 
 if in the same almost empty folder we add and catalog an object with one
 word (and now we have optimized and reduced the number of indexes to 11) it
 makes a transaction of 73K, while if the object contains 300 words with the
 same other indexes or properties, the transaction is 224K, and if all is the
 same but the object contains 535 words the transaction is 331K.
 
 And we are now using a catalog with only some 3000 documents indexed, with
 an average length of each document around 1K.
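
The three figures quoted above allow a rough per-word estimate. The snippet below is only arithmetic on those reported numbers (Python 3 division). The variable and function names are made up for the illustration, and it assumes the transaction size grows roughly linearly between neighbouring samples.

```python
# (words in the object, transaction size in KB) as reported above
samples = [(1, 73), (300, 224), (535, 331)]


def marginal_kb_per_word(samples):
    """KB added per extra word, between neighbouring samples."""
    pairs = zip(samples, samples[1:])
    return [(kb2 - kb1) / (w2 - w1) for (w1, kb1), (w2, kb2) in pairs]


costs = marginal_kb_per_word(samples)
for (w1, _), (w2, _), cost in zip(samples, samples[1:], costs):
    print("%d -> %d words: %.2f KB per extra word" % (w1, w2, cost))
```

Both intervals come out at roughly half a kilobyte per additional word, on top of the 73K that a one-word object already costs with 11 indexes.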

Well, Chris certainly knows more about the internals of ZCatalog than I
do, so we should not ignore his comments to my mail :)

Chris McDonough wrote:

  If you now add a new document containing 5 of these frequent words, 5
  larger BTrees will be updated. [Chris, let me know, if I'm now going to
  tell nonsense...] I assume that the entire updated BTrees = 12 bytes
  will be appended to the ZODB (ignoring the less frequent words) -- even
  if the document contains only 1 kB text.
 
 Nah... I don't think so.  At least I hope not!  Each bucket in a BTree
 is a separate persistent object.  So only the sum of the data in the
 updated buckets will be appended to the ZODB.  So if you add an item to
 a BTree, you don't add 24000+ bytes for each update.  You just add the
 amount of space taken up by the bucket... unfortunately I don't know
 exactly how much this is, but I'd imagine it's pretty close to the
 datasize with only a little overhead.

OK, this made me curious, so I ran a test similar to the one by Giovanni.
I started with a ZCatalog containing 21616 records; the catalog contains
only one text index, no keyword index, no field index. I copied one of
the indexed documents; the text is 2645 bytes long; wc tells me that it
has 313 words. Next, I packed the data base in order to have a clean
starting point. After packing, Data.fs has a size of 233661963 bytes.

Then I cataloged the new object using my lazy catalog. Since I have
only one new document, this is basically the same as using
CatalogAwareness. After indexing, the data base has grown to 233851090
bytes -- an increase of 189127 bytes. Then I packed the data base again,
resulting in a size of 233666237 bytes.

So the net increase is indeed 233666237-233661963 = 4274 bytes, as you
expected, but obviously a few more data base records need to be updated.
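
The same arithmetic, spelled out so the transient and the permanent growth can be compared. The three sizes are the ones reported above; the variable names are just labels chosen for this sketch.

```python
packed_before = 233661963   # Data.fs after the first pack
after_indexing = 233851090  # Data.fs after cataloging the one new document
packed_after = 233666237    # Data.fs after packing again

appended = after_indexing - packed_before          # written by the transaction
net_growth = packed_after - packed_before          # what survives a pack
reclaimed_by_pack = after_indexing - packed_after  # removed by the second pack

print(appended, net_growth, reclaimed_by_pack)
print("%.1f%% of the appended bytes survive a pack"
      % (100.0 * net_growth / appended))
```

So only a small fraction of what the transaction appends is new information; the rest is replaced record revisions that the next pack throws away.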

Abel


Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Stephan Richter


Hello everyone,

I gotta join this discussion.

iuveno was also thinking about a tool that would replace ZClasses, since
their performance is far too poor. We had a not so good experience with the
ZClass-based Kontentor and now that the first part is rewritten in Python
we can see the speed-ups (the performance increase can be measured in
multiples - real tests need to be done).
The reason - in my opinion - that ZClasses are so slow is the huge number
of Acquisition lookups and the safe rendering. You can often code things
smarter in Python using much less of the safe Zope environment, but still
providing safety through specific commands. I think Formulator is a great
example of how safe the Python programming can be.
There are two thoughts here:

1. We are building a wizard that asks you all the necessary questions to
generate a basic class framework. This wizard (which can be used in many
other fields - such as installers - as well) is currently being built. We
use Formulator a lot and I support the development of it as much as I can
(it is a cool product with many cool features). If anyone is interested in
helping develop that tool (which will be released under the GPL like all
of the iuveno products), then I can make an electronic copy of my personal
notes and set up a CVS repository.
Formulator at Zope: http://www.zope.org/Members/faassen/Formulator
Formulator at Sourceforge: http://sourceforge.net/projects/formulator/


2. Phillip Auersperg from bluedynamics.com uses ObjectDomain quite heavily,
since it has a nice JPython API that comes with it. He already built a 
reverse-engineering tool for ZClasses and is now going to write another 
tool to automatically generate DBObjects from a UML diagram in 
ObjectDomain. I am very excited about this tool, since it will make the 
already fast DBObject development even faster. As soon as Formulator goes 
into 1.0, I am going to think about binding the Formulator to DBObjects, so 
you can quickly generate forms for each object. In Berlin in two weeks, we 
are going to discuss this integration in more detail...
Bluedynamics URL: http://www.bluedynamics.com
DBObjects/SmartObjects URL: http://demo.iuveno-net.de


I am very glad to see that we all have the same vision. We all just need to 
work more together (especially me included). This way we can have some 
strikes against the big ones...

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


Yikes.  I wonder if this overhead comes from Vocabulary updates... thanks
very much for doing this test.

Clearly we need to pin it down.  This is very disappointing.  :-(  Any
further info you dig up is appreciated.

You didn't have any metadata stuff set up, did you?  I imagine even if you
did, that they couldn't possibly account for 200K worth of extra stuff.

- C


Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 26 Jun 2001 00:29:05 +0200, Erik Enge wrote:
 On 25 Jun 2001, Michael R. Bernstein wrote:
 
  Other than keeping the door open for this eventuality, is there any
  other reason to choose a BSD style license over the GPL?
 
 Yes.  A commercial one; an imperative one.  If I make a Zope Python
 Product, I must license it as GPL to be able to redistribute.  That's just
 unacceptable in my eyes.

Umm. Yes, you're right. The compatibility needs to go both ways as far
as Products are concerned. The Zope license should allow GPL'd Products,
as well as proprietary ones..

  Unless I've misunderstood something (which is certainly possible), DC
  doesn't seem to have anything to lose by switching from a BSD style
  license to the GPL (or a GPL style license with an additional optional
  attribution clause), and quite a bit to gain.
 
 How do you suppose DC make their monies?  I'm quite sure they can't
 license Zope under the GPL because they would intimidate their market too
 much with it (an assumption that could be wrong, naturally).

DC has been up-front about how they make money. They do so by selling
development services using Zope as a toolkit/platform.

 Let's hope they go for a GPL-compatible one.  I can't see what they
 would/could lose by using a BSD-style one, maybe you have some thoughts
 on that?

Well, I guess the issue is whether you think that redistribution of a
proprietary version of Zope itself is a good or bad thing. BSD style
licenses permit proprietary free-riders. Contributing anything back to
the open-source version is not required (although companies can still
choose to do so).

As DC is the copyright holder, they have the ability to do this with
their work regardless of what license they choose, since they can always
relicense or dual-license. But I have a problem allowing other players
the same privilege.

As a possible scenario, let's suppose that someone wanted to create a
content management solution for the southeast Asian market. They go to a
lot of trouble to internationalize Zope so it can handle CJK character
sets, and translate the management interfaces. Then they distribute the
entire thing as a proprietary, binary-only, retail software package, and
don't contribute back to the existing community i18n effort. While they
would be saddled with maintaining their proprietary fork thereafter,
they still reap a huge initial windfall. They can also continue to
incorporate improvements from the community with no repercussions.

Now, far be it from me to say that companies that make improvements to
Zope are not entitled to a return on their investment, but I think that
the example I've given here is one of a disproportionate reward.

Michael Bernstein.



Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 26 Jun 2001 09:30:49 +1000, Richard Jones wrote:
 On Tue, 26 Jun 2001 05:22, Michael R. Bernstein wrote:
  On 25 Jun 2001 10:26:10 -0400, Shane Hathaway wrote:
   According to management, there's a zope-license list somewhere and we
   expect to move to a GPL compatible license. Paul says:
  
   I think the goal should be for Zope and Python to converge on the same
   license, with perhaps the new license being some off-the-shelf license
   like Apache's.
 
  Hmm. So a BSD style license, then. Are there currently any Zope-derived
  distributions that are proprietary (third-party or DC's)?
 
 Absolutely! We use Zope as a core component in our product that's about to 
 hit the shelves.

I guess the question is whether your product is simply a combined
distribution of Zope and a proprietary product, or if you've made
changes to Zope itself.

  If not, does DC anticipate there being this kind of third-party
  proprietary derived distribution in the future?
 
 Absolutely! We have several products in mind that are based on Zope.

Again, are these products making proprietary changes to Zope itself, or
simply creating proprietary products and other add-ons to Zope?

 We will be distributing the entirety of the source code of all open-source 
 components of our product. We cannot distribute the source code of our 
 product - that would be sheer foolishness. We've invested about 2 man-years 
 in the code, and we're not about to just give that away. Our investors would 
 string us up!

Is your product a 'Zope Product'? If so, I think that's perfectly
acceptable, and Zope's license should certainly allow such. Perhaps the
LGPL for Zope would work.

Michael Bernstein.




Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 26 Jun 2001 10:29:39 +1000, Anthony Baxter wrote:
 
  Michael R. Bernstein wrote
  Unless I've misunderstood something (which is certainly possible), DC
  doesn't seem to have anything to lose by switching from a BSD style
  license to the GPL (or a GPL style license with an additional optional
  attribution clause), and quite a bit to gain.
 
 They will probably lose developer mindshare. Given how important 
 this is to Zope's growth (and to DC's growth, as a result), this 
 is far far more important than the karma from switching to the 
 far less flexible GPL

You're right. I hadn't considered that the ZPL needs to be 'proprietary
compatible' so far as add-on products are concerned. Perhaps the LGPL
would suffice, as that would permit creating proprietary Zope products.
But I won't be entirely happy if the ZPL permits proprietary third-party
redistributions of Zope itself.

 Your argument seems to be that DC would want to control other companies'
 ability to make distributions derived from Zope - unless they've been 
 hiding this nefarious plan from the community, this doesn't seem to
 be an objective for them.

Heh. I guess I shouldn't have stuck that in there. An argument I've
occasionally heard for BSD-style licenses is that the original (usually
corporate) author wants to be able to make proprietary releases based on
other people's contributions. The argument for NPL-style licenses is that
they (the original author) want to be the *only* one with such a
privileged position. DC has never indicated that either of these was
important to them.

 As far as a contributor to Zope wanting to keep their work free, then
 if the ZPL is GPL compatible, they can make their components GPLd.

True.

Michael Bernstein.



Re: [Zope-dev] Re: ZPL and GPL

2001-06-26 Thread Barry A. Warsaw



 FWH == Fred Wilson Horch [EMAIL PROTECTED] writes:

FWH But it would be nice to hear what Guido thinks, and what
FWH Digital Creation thinks.

I won't speak on behalf of DC, but I'll bet Guido is pretty tired of
talking about it. :)

FWH Knowing that the copyright holders have made a conscious
FWH decision not to allow developers to obtain Python and Zope
FWH under the terms of the GPL in the belief that this allows
FWH people to do whatever they want with it does help us
FWH evaluate the long-term prospects for these systems in the
FWH marketplace.

I'm not sure what point you're making.

With respect to Python, the issue has been hashed to death over in
c.l.py and other forums, so I think this will be my last post on the
subject here.  IMO, the Python 2.0.1 license is the best of all
possible worlds.  In the words of the FSF themselves:

The License of Python 2.0.1, 2.1.1, and newer versions. 
  This is a free software license and is compatible with the GNU GPL.

Dual licensing (a la Perl) has practical problems, which have been
raised in other forums, and you really want to avoid it if possible.
Python 2.0.1's license allows Python to be linked with GPL'd software
such as GNU readline.  I don't see what advantages "allowing
developers to obtain Python [...] under the terms of the GPL" would
provide above and beyond that.  Guido (and now, really the PSF) is
clearly not concerned about freeloaders taking Python and not
contributing back, which is about the only additional thing a GPL
release of Python could prevent.

Any Python module or extension you write can now be legally released
under the GPL and linked with Python.  So if you feel that the GPL
affords your code useful benefits and protections, you now have that
option, whereas under Python 1.6, 2.0, or 2.1 you didn't.

-Barry


Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread abel deuring


Chris McDonough wrote:
 
 Yikes.  I wonder if this overhead comes from Vocabulary updates... thanks
 very much for doing this test.

No, this should definitely _not_ be related to vocabulary: I simply
copied an already indexed document and let ZCatalog.catalog_object munge
the copy. So all words appearing in this copy already have an entry in
the Vocabulary. I also checked it during a test without meta data: The
vocabulary does not increase.

 
 Clearly we need to pin it down.  This is very disappointing.  :-(  Any
 further info you dig up is appreciated.

Well, I don't have any at present. But allow me to make a guess :) If
a new record is added to a BTree, it can be necessary to move a few
other records around in order to keep the tree balanced. And some of the
BTrees affected by my test are definitely somewhat larger, because I did
not use German stop words during the test, so words like "und", "der",
"die" are indexed which appear in _every_ document. (well, at least in
_nearly_ every document)
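
One way to picture this guess, together with the earlier remark that each bucket is a separate persistent object, is a toy model of an append-only store: changing one entry in a bucket writes a fresh copy of the whole bucket. The sketch below is not ZODB or BTrees code. BUCKET_CAPACITY, ENTRY_BYTES and bytes_appended are made-up illustration values and names, and real BTrees keep their keys sorted and split buckets differently.

```python
BUCKET_CAPACITY = 60  # assumed max entries per bucket (illustrative)
ENTRY_BYTES = 8       # assumed bytes per stored document id (illustrative)


def bytes_appended(postings, words, doc_id):
    """Add doc_id under each word; return bytes an append-only store writes.

    postings maps word -> list of buckets, each bucket a list of doc ids.
    Only the bucket that receives the new id is rewritten, but it is
    rewritten in full.
    """
    written = 0
    for word in set(words):
        buckets = postings.setdefault(word, [[]])
        if len(buckets[-1]) >= BUCKET_CAPACITY:
            buckets.append([])  # last bucket is full: start a new one
        buckets[-1].append(doc_id)
        written += len(buckets[-1]) * ENTRY_BYTES
    return written


# A rare word with 3 postings and a stop word present in 1000 documents.
postings = {
    'zcatalog': [[1, 2, 3]],
    'und': [list(range(i, i + 50)) for i in range(0, 1000, 50)],
}
cost = bytes_appended(postings, ['und', 'zcatalog', 'neu'], 1001)
```

In this model the stop word costs a whole bucket of 51 entries for a single new document id, while the brand-new word costs one entry. A document made mostly of words that already sit in well-filled buckets would then append far more bytes than it adds in new information, which is at least consistent with the sizes measured above.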

 
 You didn't have any metadata stuff set up, did you?  I imagine even if you
 did, that they couldn't possibly account for 200K worth of extra stuff.

Ouch, I forgot about the meta data. So here is the result of another
test, with all meta data thrown away:

Packed data base size, one document (same as during the last test) to be
cataloged: 229170221 bytes.

data base size after updating the catalog run: 229310316 bytes
size after packing: 229172566 bytes

So, same as before :(

Abel


Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 25 Jun 2001 21:54:16 +0200, Jerome Alet wrote:
 On Mon, Jun 25, 2001 at 12:22:32PM -0700, Michael R. Bernstein wrote:
  
  Other than keeping the door open for this eventuality, is there any
  other reason to choose a BSD style license over the GPL?
  ...
  Unless I've misunderstood something (which is certainly possible), DC
  doesn't seem to have anything to lose by switching from a BSD style
  license to the GPL (or a GPL style license with an additional optional
  attribution clause), and quite a bit to gain.
 
 I personally would love to see both Python and Zope be GPLed.
 
 However we should take into consideration the fact that this would 
 mandate that any Zope product should be GPLed too, since in the FSF
 view we link them to Zope.

Did anyone ever get an 'official' statement to that effect? Specifically
that creating a Zope Product that subclasses Zope base classes would
require the product to be GPL'd? What about the LGPL?

 The same for Python C extensions, we would link them to a GPLed software 
 (Python), so they would have to be GPLed too.
 
 That's why I'm pretty sure that unfortunately both Zope and Python 
 would lose supporters if they were GPLed.

This makes sense.

Michael.



Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 26 Jun 2001 09:46:09 +0200, Erik Enge wrote:
 On Tue, 26 Jun 2001, Jerome Alet wrote:
 
  For Zope it's not sure, but for Python, as well as for all what people
  usually call open source languages, the license of choice should be
  the GPL, or at least the LGPL, in order for the language in question
  to not become bastardized by some powerful entity.
 
 I can't see this happening to that entity's success.  Could you give me an
 example of something like that happening in the past?

Microsoft's proprietary version of Kerberos. Kerberos was licensed under
a BSD-style license.

Michael Bernstein.



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Morten W. Petersen


On Tue, 26 Jun 2001, Stephan Richter wrote:

 - A simple DTML Zope programmer's costs are okay and maybe below programmer 
   average.
 - A good Zope/Python programmer will cost above average.
 - A good Zope/Python System-Designer is very expensive.
 
 Because of that you try to minimize the Designer's time by providing a nice 
 tool (UML tool, such as ObjectDomain). Then you try to minimize the Python 
 Programmer's time by auto-generating the framework and only make him fill 
 the methods with life. Now, because we have a UML diagram, the DTML 
 programmer can start right away with programming the DTML and HTML around 
 the data/functional model, since the API is clear. This way you optimize 
 several things:

If DTML programming / interface design is so simple, and cheap, why not
automate it?  (Strike two for low cost development).

I've been trying to save some time (and my fingers!) by building a RAD
framework, named the WarpFramework [1], which deals with the low level
complexity of properties and how to display / manipulate them in addition
to other common programming tasks.. this could perhaps blend in easily..

Say for example that we could provide the odd designer with the
possibility of simply pushing widgets and displays (generated
automatically of course) around the page, and changing colors and
backgrounds (CSS); then we have a tool that designers would love as well.

 - Minimize the time of the expensive people.

Minimize the time of people.  Period.  :-)

. o O ( Many small rivers make.. )

 - Minimize the development time, since many people can work in parallel.

..lessen the complexity and add to robustness.

 - Because of the above, you minimize risk and money spent. And voila, you 
   have a well functioning RAD team.*
 
 * This assumes that your team works together well. ;-)

Well, somebody's got to do something, eh?  ;-)

[1] http://www.sourceforge.net/projects/warp-framework

Regards from France,

Morten






Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread R.


On 26 Jun 2001 10:30:06 -0400, Barry A. Warsaw wrote:
 
  JA == Jerome Alet [EMAIL PROTECTED] writes:
 
 JA For Zope it's not sure, but for Python, as well as for all
 JA what people usually call open source languages, the license
 JA of choice should be the GPL, or at least the LGPL, in order
 JA for the language in question to not become bastardized by some
 JA powerful entity.
 
 I think I'm accurately channeling Guido when I say that Python will
 never be GPL'd.  AFAIK, there is no GPL code even in the standard
 Python distribution.  Both of those states of affair are by conscious
 decision: regardless of what you think of the GPL (and I personally
 happen to believe it can be a good license for /some/ software, but
 not all) GPL'ing Python would be a very bad thing.  Guido has always
 intended for people to do whatever they want with Python, including
 using it in everything from closed source, proprietary, big-$$$
 software to completely free software.

I guess I don't understand how licensing Python under the GPL would
prevent people from writing proprietary software in Python.

Compiling a program using gcc doesn't require that the program be GPL'd.

Michael Bernstein.



Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


 Chris McDonough wrote:
 
  Yikes.  I wonder if this overhead comes from Vocabulary updates...
  thanks very much for doing this test.

 No, this should definitely _not_ be related to vocabulary: I simply
 copied an already indexed document and let ZCatalog.catalog_object munge
 the copy. So all words appearing in this copy already have an entry in
 the Vocabulary. I also checked it during a test without meta data: The
 vocabulary did not increase.

OK, that's good to know...

 
  Clearly we need to pin it down.  This is very disappointing.  :-(  Any
  further info you dig up is appreciated.

 Well, I don't have any at present. But allow me to make a guess :) If
 a new record is added to a BTree, it can be necessary to move a few
 other records around in order to keep the tree balanced. And some of the
 BTrees affected by my test are definitely somewhat larger, because I did
 not use German stop words during the test, so words like "und", "der",
 "die" are indexed which appear in _every_ document. (well, at least in
 _nearly_ every document)

 
  You didn't have any metadata stuff set up, did you?  I imagine even if
  you did, that they couldn't possibly account for 200K worth of extra stuff.

 Ouch, I forgot about the meta data. So here is the result of another
 test, with all meta data thrown away:

 Packed data base size, one document (same as during the last test) to be
 cataloged:
 229170221 bytes.

 data base size after running the catalog update: 229310316 bytes
 size after packing: 229172566 bytes

 So, same as before :(
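
The three sizes quoted above can be turned into per-document figures with a
little arithmetic. This is only a sketch of that subtraction; the variable
names are made up for illustration, the numbers are the ones from the test.

```python
# Sizes in bytes, exactly as quoted in the test above.
packed_before = 229170221   # packed database, before cataloging one document
after_update = 229310316    # after the catalog update
packed_after = 229172566    # after packing again

growth_unpacked = after_update - packed_before   # appended by the one update
growth_packed = packed_after - packed_before     # what survives a pack
reclaimed = after_update - packed_after          # superseded records dropped by the pack

print(growth_unpacked, growth_packed, reclaimed)
```

So a single cataloged document appends roughly 140 KB to the file, of which
only a little over 2 KB remains once the database is packed.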

Well, I'm sort of stumped without doing it myself, and I can't at the
moment.  I'm going to add this to the Collector so I don't forget, and
hopefully it will be looked into and fixed by the time that 2.4.0 goes out.

Thanks so much,

- C




Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Eric Roby



Erik Enge wrote on 26 June:
 If your application can't be written in five minutes and you expect to use
 it more than once, you shouldn't use ZClasses - IMO.  The only argument
 for ZClasses (that I had at the time) was that it was very easy and fast
 to set up a couple of classes and some instances.  After I wrote mk-zprod,
 making Python Products is even faster than ZClasses, and certainly scales
 better.

Thank you for your thoughts.  The comments throughout this thread have been
very insightful.  I see where I need to go from here.  I will be checking
out mk-zprod. I have a nagging question that you might be able to help me
with (in light of mk-zprod).  I understand that the 'Class-Id' that a ZClass
gets assigned upon creation is vital to Zope's ability to manage class
instances.  Is it possible to make a Python class to replace the ZClasses I
have already created and be able to support the ZClass instances I have
already created?  If so, how do you get a Python class to answer to a
specific Class-Id?


Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread A.J. Rossini


 MRB == Michael R Bernstein Michael writes:

MRB On 26 Jun 2001 10:30:06 -0400, Barry A. Warsaw wrote:
  JA == Jerome Alet [EMAIL PROTECTED] writes:
 
JA For Zope it's not sure, but for Python, as well as for all
JA what people usually call open source languages, the license
JA of choice should be the GPL, or at least the LGPL, in order
JA for the language in question to not become bastardized by some
JA powerful entity.
 I think I'm accurately channeling Guido when I say that Python
 will never be GPL'd.  AFAIK, there is no GPL code even in the
 standard Python distribution.  Both of those states of affair
 are by conscious decision: regardless of what you think of the
 GPL (and I personally happen to believe it can be a good
 license for /some/ software, but not all) GPL'ing Python would
 be a very bad thing.  Guido has always intended for people to
 do whatever they want with Python, including using it in
 everything from closed source, proprietary, big-$$$ software to
 completely free software.

MRB I guess I don't understand how licensing Python under the GPL
MRB would prevent people from writing proprietary software in
MRB Python.

Here's a case in agreement with the above:

There's a statistical language, R, whose implementation is
GPL'd.  Recently, a research organization in Australia (who shall
remain nameless) started selling a binary package for it to do
microarray analysis.  So, value-added software, and the question was
whether it violated the GPL.  Current thinking (including that of the
R-core team) was that if they wanted to profit, fine, as long
as they didn't build using GPL'd header files (and the core team
promptly LGPL'd the headers).

best,
-tony

-- 
A.J. Rossini                        Rsrch. Asst. Prof. of Biostatistics
U. of Washington Biostatistics  [EMAIL PROTECTED]
FHCRC/SCHARP/HIV Vaccine Trials Net [EMAIL PROTECTED]
 (wednesday/friday is unknown) 
FHCRC: M-Tu : 206-667-7025 (fax=4812)|Voicemail is pretty sketchy/use Email
UW:Th   : 206-543-1044 (fax=3286)|Change last 4 digits of phone to FAX



Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread Erik Enge


On 26 Jun 2001, Michael R. Bernstein wrote:

 DC has been up-front about how they make money. They do so by selling
 development services using Zope as a toolkit/platform.

Yes, and forcing those paying customers to use GPL is very hard (and not
very nice, either).
 
 Well, I guess the issue is whether you think that redistribution of a
 proprietary version of Zope itself is a good or bad thing.

No, that's not the issue, since I don't believe there will ever be a large
successful proprietary version of Zope.  I think that is where we differ
in opinions.  Which is something that can only be tested by giving it
time :).

 As a possible scenario, let's suppose that someone wanted to create a
 content management solution for the Southeast Asian market.

I just don't think it would be very successful.  Zope isn't the type of
application that would be great as a closed-source one.  I just can't see
that happening; maybe I'm too naive.



[Zope-dev] TransParentFolder reverseable?

2001-06-26 Thread Christian Theune


hi again ...

I would like to know how it would be possible to reverse
the function of the TransparentFolders, because I want to
use them as a patch-like function on existing Zope trees.

Scenario:

You have a Zope application that uses more than 1000 objects
in the Zope tree (e.g. /main), which you sold to many customers.
Every customer also had the choice to get some minor modifications
to this main tree.

Now I thought I could keep the main tree, and use the TransparentFolders
so that lookup goes into these FIRST, and only after not finding anything
in them tries the main tree.

You could have a set of Folders as incremental updates to the main tree
that are a bit more atomic.
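
The lookup order described above can be sketched in plain Python. This is
not how TransparentFolders or Zope's own lookup actually work; plain dicts
stand in for folders, and find_object, customer_patch and update_1 are
hypothetical names used only for illustration.

```python
def find_object(name, patch_folders, main_tree):
    """Return the object called `name`, trying patch folders before the main tree."""
    for folder in patch_folders:      # most specific customisation first
        if name in folder:
            return folder[name]
    return main_tree[name]            # raises KeyError if it is missing everywhere


# Plain dicts stand in for Zope folders (an assumption for this sketch).
main_tree = {"index_html": "stock page", "logo": "stock logo", "help": "stock help"}
customer_patch = {"logo": "customer logo"}
update_1 = {"index_html": "fixed page"}

layers = [customer_patch, update_1]
print(find_object("logo", layers, main_tree))
print(find_object("help", layers, main_tree))
```

Each incremental update is just one more entry in the list of layers, so
removing a patch means dropping its folder from the search order.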

Any hints? Mostly, I need more information about the process of
finding an object, so that I know where to hook in the alternative
search procedure.

Tnx and good night.

Christian Theune


-- 
Christian Theune - [EMAIL PROTECTED]
gocept gmbh & co. kg - schalaunische strasse 6 - 06366 koethen/anhalt
tel.+49 3496 3099112 - fax.+49 3496 3099118 mob. - 0178 48 33 981

reduce(lambda x,y:x+y,map(lambda x:chr(ord(x)^42),tuple('zS^BED\nX_FOY\x0b')))


Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Casey Duncan


Chris McDonough wrote:
 
  Chris:
 
  I am working on getting a decent query language for ZCatalog/Catalog and
 
 Very cool...
 
  I have been able to make good progress, however I am running into a bit
  of an issue that I thought you might know something about:
 
  In order to implement a != query operator, I am trying to do the
  following:
 
 Tricky.
 

Ok, I was able to get it to work by instantiating a IISet around
_unindex.keys() and passing that to difference (Thanks!), however, I
notice an interesting side effect. Let's say you have a TextIndex on
title and you do the following query:

title != 'foo*'

Which to me means: "all cataloged objects whose titles do not match the
substring 'foo*'"

However, this is not exactly what you get; instead you get:

"all cataloged objects that have a non-empty title that does not match
the substring 'foo*'"

Because from what I am seeing, objects with empty (or no) titles are not
included in the index *at all*. So the set of all objects does not
include ones without titles. I could fix this by making "all objects"
mean "all objects in the catalog" (via catalog.data.keys()) instead
of "all objects in the index", but I wanted to see if anyone had
additional thoughts about this.
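
To make the difference between the two readings concrete, here is a small
sketch that uses ordinary Python sets in place of IISet and the catalog's
real data structures. The ids and the not_equal helper are invented for
illustration; only the set-difference idea comes from the description above.

```python
def not_equal(matching_ids, universe_ids):
    """A '!=' query as a set difference: everything in the universe that did not match."""
    return set(universe_ids) - set(matching_ids)


# Hypothetical record ids. Records 4 and 5 have an empty (or no) title,
# so they never made it into the title index.
catalog_ids = {1, 2, 3, 4, 5}   # stands in for catalog.data.keys()
indexed_ids = {1, 2, 3}         # stands in for the index's _unindex.keys()
matches_foo = {1}               # result of the query title == 'foo*'

print(sorted(not_equal(matches_foo, indexed_ids)))   # universe = the index
print(sorted(not_equal(matches_foo, catalog_ids)))   # universe = the whole catalog
```

With the index as the universe, records 4 and 5 silently drop out of the
result; with the whole catalog as the universe they come back.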

-- 
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`--


Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Michel Pelletier


On Tue, 26 Jun 2001, Casey Duncan wrote:

 Ok, I was able to get it to work by instantiating a IISet around
 _unindex.keys() and passing that to difference (Thanks!), however, I
 notice an interesting side effect. Let's say you have a TextIndex on
 title and you do the following query:

 title != 'foo*'

 Which to me means: all cataloged objects whose title do not match the
 substring 'foo*'

 However, this is not what you get exactly, instead you get:

 all cataloged objects that have a non-empty title that does not match
 the substring 'foo*'

 Because from what I am seeing, objects with empty (or no) titles are not
 included in the index *at all*. So the set of all objects does not
 include ones without titles. I could fix this by making all objects be
 instead All objects in the catalog (via catalog.data.keys()) instead
 of all objects in the index, but I wanted to see if anyone had
 additional thoughts about this.

Hmm, the reason for the current behavior was optimization: saving space by
not indexing empty values.  The problem with your latter approach is that
"all objects in the catalog" may include objects that don't have a title
attribute at all.

I'm not against indexing empty values though.

-Michel



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalogbloat problem (berkeleydb is a solution?))

2001-06-26 Thread Erik Enge


On Tue, 26 Jun 2001, Stephan Richter wrote:

 I looked at the code pretty quick. I like it from the first view. It
 is very clean and easy to see the functionality. I think, if you can
 define an ZPI for your communication, then it will be no problem to
 put a SmartWizard Class Generator Front-End on top of this product.
 
What is ZPI?

 Well, the SmartWizard would be like the frontend for it. If you want
 to see an early non-SmartWizard-framework version, look at the
 ProiektorInstaller.  Just imagine you can build installers like that
 in Zope. I hope to be done with the first version on Thursday.

Great, will you let us know?
 
 I will upload the documents tomorrow though, since it is late here and
 I have to do some work still.

Ok.  I'll begin thinking about all the stuff I dreamt of making mk-zprod
into.
 
 I think that we will be able to combine the two products easily, once the 
 SmartWizard is in a semi-stable state.

Sounds good to me :)



Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread Anthony Baxter



 Michael R. Bernstein wrote
 I guess I don't understand how licensing Python under the GPL would
 prevent people from writing proprietary software in Python.

Embedded or frozen Python. I know I'd much rather see Python embedded
in applications than Tcl or (god help us all) Javascript/ECCCHMAScript.
I can't see Cisco agreeing to open-source IOS so that they can embed a
decent language in it.


-- 
Anthony Baxter [EMAIL PROTECTED]   
It's never too late to have a happy childhood.



Re: [Zope-dev] Hey Chris, question for you

2001-06-26 Thread Chris McDonough


Hi Casey,

Changes were recently made to Field/Keyword Indexes so that they will
store empty items.  An equivalent change could be made to TextIndexes...
we'd need to think about that a bit.

But for your purposes, you might want to start out attempting to write
your operator implementation using Field and Keyword indexes...

- C


Michel Pelletier wrote:
 
 On Tue, 26 Jun 2001, Casey Duncan wrote:
 
  Ok, I was able to get it to work by instantiating a IISet around
  _unindex.keys() and passing that to difference (Thanks!), however, I
  notice an interesting side effect. Let's say you have a TextIndex on
  title and you do the following query:
 
  title != 'foo*'
 
  Which to me means: all cataloged objects whose title do not match the
  substring 'foo*'
 
  However, this is not what you get exactly, instead you get:
 
  all cataloged objects that have a non-empty title that does not match
  the substring 'foo*'
 
  Because from what I am seeing, objects with empty (or no) titles are not
  included in the index *at all*. So the set of all objects does not
  include ones without titles. I could fix this by making all objects be
  instead All objects in the catalog (via catalog.data.keys()) instead
  of all objects in the index, but I wanted to see if anyone had
  additional thoughts about this.
 
 Hmm, the reason for the current behavior was optimization: saving space by
 not indexing empty values.  The problem with your latter approach is that
 "all objects in the catalog" may include objects that don't have a title
 attribute at all.
 
 I'm not against indexing empty values though.
 
 -Michel
 


Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Stephan Richter



What is ZPI?

Typo: Read API

  Well, the SmartWizard would be like the frontend for it. If you want
  to see an early non-SmartWizard-framework version, look at the
  ProiektorInstaller.  Just imagine you can build installers like that
  in Zope. I hope to be done with the first version on Thursday.

Great, will you let us know?

Of course. ;-)

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Andy McKay


 What is ZPI?
 
 Typo: Read API

It's the Zope version of an API :)

Cheers.
--
  Andy McKay.



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Stephan Richter


At 06:40 PM 6/26/01 -0700, Andy McKay wrote:
  What is ZPI?
 
  Typo: Read API

It's the Zope version of an API :)

If you want to see it that way. Yeah, we just created another three-letter 
acronym for this world!!! So everyone, it is no longer API, but ZPI...geez, I 
am starting to get silly, that means I need some sleep...

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management



Re: Something better than ZClasses (was: Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?))

2001-06-26 Thread Stephan Richter



  I will upload the documents tomorrow though, since it is late here and
  I have to do some work still.

Ok.  I'll begin thinking about all the stuff I dreamt of making mk-zprod
into.

Okay, okay...I stayed up and typed it up pretty quickly (2 hours). I 
attached it to this mail. It is plain text, since I was too lazy to do it 
in HTML. It might be a little unstructured, but I am too tired to fix that now.

Regards,
Stephan

--
Stephan Richter
CBU - Physics and Chemistry Student
Web2k - Web Design/Development & Technical Project Management

SmartWizards - A framework to generate the Every-Day-Wizard
===========================================================
by Stephan Richter in June 2001

Version 0.1


The following parts of a *default* WizardPage are defined:

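+-------------+------------------------------+
|             |                              |
|    Logo     | Header and Long Description  |
|             |                              |
+-------------+------------------------------+
|             |                              |
|   Wizard    |            Wizard            |
|             |                              |
|  Overview   |             Main             |
|             |                              |
|             |            Window            |
|             |                              |
|             |                              |
+-------------+------------------------------+
|              Status Messages               |
+--------------------------------------------+
|      Navigation Bar (buttons to move)      |
+--------------------------------------------+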

Therefore there are the following standard methods defined in a wizard:

wizardHeader        wizardStatus
wizardFooter        wizardNavigationBar
wizardOverview      wizardMainWindow
wizardDescription


Functionalities of these methods:
=================================

wizardOverview:
---------------
  - should display a list of all pages (display short description/title)
  - needs to know about the active page to highlight it

  Default Output: A numbered list of all the pages with the active one 
  highlighted.


wizardDescription:
------------------
  - shows the 'long' description of the active page
  - there might be also a static part, depending on the application


wizardStatus:
-------------
  - status messages/information
      o Errors by form validation
      o administrative messages (required fields, ...)
      o wizard information (page x out of y)

wizardNavigationBar:
--------------------
  - there should be only buttons here!
  - maybe we should define where the buttons are, since grouping some
    of them will be necessary


wizardMainWindow:
-----------------
  - here goes the real page information
  - there can be: forms, other actions, information and everything mixed



Classes
=======

SmartWizard -- Folder
---------------------

  - We are going to use Sessions (CoreSessionTracking) and Versions to keep
    track of the user's status.
  - Since it will be too hard to collect all information first and then
    commit everything at the end, we decided to use versions, which we can
    always decline to commit if a rollback is requested.
  - All the other info is saved in the session, since several people could
    use the wizard at once.
  - Also, we will keep track of the active page by simply storing the index
    of the page inside the Pages folder; since it is an OrderedFolder we can
    do this safely.

  - methods: wizardHeader
             wizardFooter
             wizardCSS
             wizardStatus
             wizardNavigationBar
             wizardOverview
             wizardMainWindow
             wizardDescription
             wizardSession - Contains all the session data information.
                             Object is of type SessionDataManager.
             Versions      - Folder that contains versions of the people who
                             use the wizard.
                             Object is of type Folder.
             Pages         - The container of all the WizardPages that are
                             displayed during the wizard.
                             Object is of type OrderedFolder (see Zope.org).

  - attributes: activePageIndex - specific value is stored in the session
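
The idea of keeping activePageIndex in the session rather than on the
wizard itself can be sketched as follows. This is an illustration only, not
SmartWizard code: a plain list stands in for the ordered Pages folder, a
plain dict stands in for the session data object, and the class and method
names are hypothetical.

```python
class WizardSketch:
    """Toy wizard: the pages are shared, the position lives in each user's session."""

    def __init__(self, page_titles):
        self.pages = list(page_titles)   # stands in for the ordered Pages folder

    def active_page(self, session):
        return self.pages[session.get("activePageIndex", 0)]

    def move(self, session, step):
        """Move forward (+1) or back (-1), staying inside the page list."""
        index = session.get("activePageIndex", 0) + step
        index = max(0, min(index, len(self.pages) - 1))
        session["activePageIndex"] = index
        return self.pages[index]

    def status(self, session):
        return "page %d out of %d" % (
            session.get("activePageIndex", 0) + 1, len(self.pages))


wizard = WizardSketch(["Welcome", "License", "Install"])
alice, bob = {}, {}      # one session per user
wizard.move(alice, +1)
print(wizard.active_page(alice), "/", wizard.active_page(bob))
print(wizard.status(alice))
```

Because the index is per session, two people can walk through the same
wizard at once without disturbing each other's position.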

[Zope-dev] Re: ZPL and GPL

2001-06-26 Thread Fred Wilson Horch


I apologize in advance to those on the list for whom this lowers the
signal to noise ratio.  I hope by now you've killed the subject line and
you won't even see this if you're not interested.

I still haven't heard where the proper forum is for discussing Zope
licensing issues; here's my last post on the subject to zope-dev.

Fred

Barry wrote:
 
 FWH Knowing that the copyright holders have made a conscious
 FWH decision not to allow developers to obtain Python and Zope
 FWH under the terms of the GPL in the belief that this allows
 FWH people to do whatever they want with it does help us
 FWH evaluate the long-term prospects for these systems in the
 FWH marketplace.
 
 I'm not sure what point you're making.

That IMHO the copyright holders are unwittingly consigning Python and
Zope to remain niche products.

Perl and Ruby (two competitors to Python -- one with a much larger
installed base and the other with arguably superior features) both offer
developers the choice to obtain the language under the GPL.

In other words, the GPL is a competitive advantage for Perl and Ruby
over Python.

 With respect to Python, the issue has been hashed to death over in
 c.l.py and other forums, so I think this will be my last post on the
 subject here.

I'll follow suit (this will be my last post on the subject here) with
these final thoughts:

1.  The GPL offers developers and end users a high comfort level.  You
can argue its shortcomings, but many influential people view it as the
gold standard for free software licenses.  In particular, since it is
widely used, championed and studied, as legal counsel I don't have to
strain my brain when my client asks me what can or cannot be done with
software that is licensed under the GPL: I've already done the research
and have the answer ready.  This is not the case with the Python license
or the ZPL.

2.  Companies should offer their customers freedom of choice.  Most
companies have found that the practical problems raised by offering
customers licensing choices are outweighed by practical benefits,
including increased market share and profits.

3.  Established companies are running scared of the GPL with good
reason.  Have you talked to a Microsoft lobbyist recently?  New entrants
with innovative business strategies may stand to gain at the expense of
established software companies who base their business around keeping
secrets and prohibiting people from sharing computer files.

Finally, I'd like to respond to some specific points Barry raises:

  IMO, the Python 2.0.1 license is the best of all
 possible worlds.  In the words of the FSF themselves:
 
 The License of Python 2.0.1, 2.1.1, and newer versions.
   This is a free software license and is compatible with the GNU GPL.

This quote is from http://www.gnu.org/philosophy/license-list.html which
states in relevant parts:

The GNU General Public License, or GNU GPL for short. 
 This is a free software license, and a copyleft license.  We recommend
 it for most software packages. 

The License of Python 1.6a2 and earlier versions. 
 This is a free software license and is compatible with the GNU GPL.
 Please note, however, that newer versions of Python are under other
 licenses (see below). 
The License of Python 2.0.1, 2.1.1, and newer versions. 
 This is a free software license and is compatible with the GNU GPL.
 Please note, however, that intermediate versions of Python (1.6b1,
 through 2.0 and 2.1) are under a different license (see below). 

Notice two things:

1) FSF explicitly recommends using the GPL; they do not explicitly
recommend using the Python license.

2) Python 1.6 screwed up the Python license.

 Dual licensing (a la Perl) has practical problems, which have been
 raised in other forums, and you really want to avoid it if possible.
 Python 2.0.1's license allows Python to be linked with GPL'd software
 such as GNU readline.  I don't see what advantages allowing
 developers to obtain Python [...] under the terms of the GPL would
 provide above and beyond that.

The main advantage I see is that it would calm developer fears that the
people behind Python don't have a clue what they are doing when it comes
to licensing.

Consider that Guido et al jumped ship twice and along the way managed to
drag Python through a disastrous 1.6 release.  I think this accounts for
why Guido is tired of talking about licensing issues.

We all make mistakes. It's what we do about them that counts.

  Guido (and now, really the PSF) is
 clearly not concerned about freeloaders taking of Python and not
 contributing back, which is about the only additional thing a GPL
 release of Python could prevent.

I think "prevent" is the wrong way to think about Python.  "Repair the
damage done with 1.6" and "reach out to new developers" are where efforts
need to go now.

The GPL can help on both counts.

 Any Python module or extension you write can now be legally released
 under the

[Zope-dev] url quote from pyton scripts

2001-06-26 Thread Magnus Heino (Rivermen)



Hi.

How can I do a url quote from a python script?

Can I somehow access the method in urllib or DT_Var or how can it be done?
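
For reference, this is what URL quoting looks like with the standard
library in plain Python. It is a sketch under assumptions: it uses the
modern location urllib.parse (in 2001-era Python the same function lived at
urllib.quote), and whether a restricted Python Script is allowed to import
it depends on how the Zope instance is configured. The sample strings are
made up.

```python
from urllib.parse import quote, quote_plus

# quote() escapes everything except letters, digits, '_.-~' and, by default, '/'.
path = quote("my folder/report 2001?.html")

# quote_plus() is meant for query-string values: spaces become '+', '&' is escaped.
value = quote_plus("cats & dogs")

print(path)
print("search?q=" + value)
```

Zope also ships a helper module for Python Scripts,
Products.PythonScripts.standard, which as far as I know exposes url_quote;
treat that as something to check against your Zope version.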

--

/Magnus Heino
 


Re: [Zope] Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Chris McDonough


abel deuring wrote:
 A text index (class SearchIndex.UnTextIndex) is definitely a cause of
 bloating, if you use CatalogAware objects. An UnTextIndex maintains for

Right.. if you don't use CatalogAware, however, and don't unindex before
reindexing an object, you should see a huge bloat savings, because the
only things which are supposed to be updated then are indexes and
metadata which have data that has changed.

 each word a list of documents in which this word appears. So, if a
 document to be indexed contains, say, 100 words, 100 IIBTrees
 (containing mappings documentId -> word score) will be updated. (see
 UnTextIndex.insertForwardIndexEntry) If you have a larger number of
 documents, these mappings may be quite large: Assume 10,000 documents,
 and assume that you have 10 words which appear in 30% of all documents.
 Hence, each of the IIBTrees for these words contains 3000 entries. (Ok,
 one can try to keep this number of frequent words low by using a good
 stop word list, but at least for German, such a list is quite difficult
 to build. And one can argue that many not really frequent words
 should be indexed in order to allow more precise phrase searches.) I don't
 know the details of how data is stored inside the BTrees, so I can give
 only a rough estimate of the memory requirements: With 32 bit integers,
 we have at least 8 bytes per IIBTree entry (documentId and score), so
 each of the 10 BTrees for the frequent words has a minimum length of
 3000*8 = 24000 bytes.
 
 If you now add a new document containing 5 of these frequent words, 5
 larger BTrees will be updated. [Chris, let me know, if I'm now going to
 tell nonsense...] I assume that the entire updated BTrees (5*24000 =
 120000 bytes) will be appended to the ZODB (ignoring the less frequent
 words) -- even if the document contains only 1 kB text.

Nah... I don't think so.  At least I hope not!  Each bucket in a BTree
is a separate persistent object.  So only the sum of the data in the
updated buckets will be appended to the ZODB.  So if you add an item to
a BTree, you don't add 24000+ bytes for each update.  You just add the
amount of space taken up by the bucket... unfortunately I don't know
exactly how much this is, but I'd imagine it's pretty close to the
datasize with only a little overhead.
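
The two estimates can be put side by side with a few lines of arithmetic.
The 8 bytes per entry, 3000 entries per word and 5 frequent words come from
the estimate quoted above; the bucket capacity is NOT known from this
thread, so ASSUMED_BUCKET_ENTRIES is a purely hypothetical figure, and
pickle overhead is ignored.

```python
ENTRY_BYTES = 8                 # documentId + score, 32 bits each
ENTRIES_PER_FREQUENT_WORD = 3000
FREQUENT_WORDS_IN_NEW_DOC = 5
ASSUMED_BUCKET_ENTRIES = 60     # hypothetical bucket capacity, for illustration only


def rewrite_whole_trees():
    """Bytes appended if every touched tree were rewritten in full."""
    return FREQUENT_WORDS_IN_NEW_DOC * ENTRIES_PER_FREQUENT_WORD * ENTRY_BYTES


def rewrite_one_bucket_per_tree():
    """Bytes appended if only the one bucket that changed in each tree is rewritten."""
    return FREQUENT_WORDS_IN_NEW_DOC * ASSUMED_BUCKET_ENTRIES * ENTRY_BYTES


print(rewrite_whole_trees(), rewrite_one_bucket_per_tree())
```

Whatever the real bucket size is, the second figure scales with the bucket
capacity rather than with the number of documents that contain the word.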
 
 This is the reason why I'm working on some kind of "lazy cataloging".
 My approach is to use a Python class (or base class, if ZClasses are
 involved), which has a method manage_afterAdd. This method looks for
 superValues of a type like "lazyCatalog" (derived from ZCatalog), and
 inserts self.getPhysicalPath() into the update list of each found
 "lazyCatalog".
 
 Later, a "lazyCatalog" can index all objects in this list. Then the
 bloating happens either in RAM (without subtransactions), or in a
 temporary file, if you use subtransactions.
 
 OK, another approach which fits better to your (Giovanni) needs might be
 to use another data base than ZODB, but I'm afraid that even then
 instant indexing will be an expensive process, if you have a large
 number of documents.

Another option is to use a session manager, and update the catalog at
session-end.
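
Both suggestions, the lazy catalog and updating at session end, come down
to "remember what changed now, index it later in one batch". A minimal
sketch of that pattern in plain Python follows; it does not use ZCatalog,
manage_afterAdd or any Zope API, and the class, its methods and the sample
paths are hypothetical.

```python
class LazyCatalogSketch:
    """Toy deferred indexer: adding is cheap, the expensive work happens in flush()."""

    def __init__(self):
        self.pending = []   # paths waiting to be indexed
        self.indexed = {}   # path -> set of words found in the document

    def queue(self, path):
        """Called when an object is added; only records the path."""
        if path not in self.pending:
            self.pending.append(path)

    def flush(self, resolve):
        """Index everything queued so far; `resolve` maps a path to its text."""
        count = len(self.pending)
        for path in self.pending:
            self.indexed[path] = set(resolve(path).split())
        self.pending = []
        return count


documents = {"/site/a": "und der Katalog", "/site/b": "der Index"}
catalog = LazyCatalogSketch()
catalog.queue("/site/a")
catalog.queue("/site/b")
catalog.queue("/site/a")             # queued twice, indexed once
print(catalog.flush(documents.get))
```

The index structures are then touched once per batch instead of once per
added document.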

- C


Re: [Zope] Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Casey Duncan


Chris McDonough wrote:
 
 abel deuring wrote:
  A text index (class SearchIndex.UnTextIndex) is definitely a cause of
  bloating, if you use CatalogAware objects. An UnTextIndex maintains for
 
 Right.. if you don't use CatalogAware, however, and don't unindex before
 reindexing an object, you should see a huge bloat savings, because the
 only things which are supposed to be updated then are indexes and
 metadata which have data that has changed.
 
[snip]

What if any disadvantages are there to not calling unindex_object first?
If there aren't any good ones, I think I'll be rewriting some of my own
CatalogAware code...
-- 
| Casey Duncan
| Kaivo, Inc.
| [EMAIL PROTECTED]
`--

[Zope-dev] CatalogAware

2001-06-26 Thread Toby Dickenson


On Tue, 26 Jun 2001 09:31:02 -0400, Chris McDonough
[EMAIL PROTECTED] wrote:

Right.. if you don't use CatalogAware, however, and don't unindex before
reindexing an object, you should see a huge bloat savings, because the
only things which are supposed to be updated then are indexes and
metadata which have data that has changed.

CatalogAware has been blamed for a lot of problems. The three
weaknesses I am aware of are:

a. Unindexing before reindexing causes bloat by defeating the
   catalog's change-detection tricks.

b. It uses URLs, not paths, and so doesn't play right with
   virtual hosting.

c. It uses the same hooks as ObjectManager to detect that
   it has been added to or removed from a container, and
   therefore the two can't easily be mixed together as
   base classes.
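
Weakness (a) can be pictured with a toy index that counts its own writes. This is a sketch under assumptions, not the ZCatalog index code: ToyFieldIndex and its writes counter are made up for illustration, and only the method names index_object and unindex_object echo the discussion.

```python
_MISSING = object()


class ToyFieldIndex:
    """Toy index that skips the write when the indexed value is unchanged."""

    def __init__(self):
        self.values = {}   # document id -> indexed value
        self.writes = 0    # how many times stored data was modified

    def index_object(self, doc_id, value):
        if self.values.get(doc_id, _MISSING) == value:
            return 0       # change detection: nothing to write
        self.values[doc_id] = value
        self.writes += 1
        return 1

    def unindex_object(self, doc_id):
        if doc_id in self.values:
            del self.values[doc_id]
            self.writes += 1


index = ToyFieldIndex()
index.index_object(1, "Articolo")
index.writes = 0

# Reindex in place: the value has not changed, so nothing is written.
index.index_object(1, "Articolo")
writes_reindex_only = index.writes

# Unindex first, then index again: the old value is gone, so the
# comparison cannot succeed and both steps modify stored data.
index.unindex_object(1)
index.index_object(1, "Articolo")
writes_unindex_first = index.writes - writes_reindex_only
print(writes_reindex_only, writes_unindex_first)
```

In this model, reindexing an unchanged object costs no writes, while the unindex-then-index sequence always costs two per index.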

All of these are fixable, and I feel a patch coming on.

Are there some deeper problems I am not aware of?

Toby Dickenson
[EMAIL PROTECTED]

[Zope-dev] Re: [Zope] CatalogAware

2001-06-26 Thread Chris McDonough


I actually think this about sums it up.  If you have time to look at it,
Toby, it would be much appreciated.  I don't think it's a very
complicated set of fixes; it's just not on the radar at the moment, and
might require some thought about backwards compatibility.

- C


Toby Dickenson wrote:
 
 On Tue, 26 Jun 2001 09:31:02 -0400, Chris McDonough
 [EMAIL PROTECTED] wrote:
 
 Right.. if you don't use CatalogAware, however, and don't unindex before
 reindexing an object, you should see a huge bloat savings, because the
 only things which are supposed to be updated then are indexes and
 metadata which have data that has changed.
 
 CatalogAware has been blamed for a lot of problems. The three
 weaknesses I am aware of are:
 
 a. Unindexing before reindexing causes bloat by defeating the
    catalog's change-detection tricks.
 
 b. It uses URLs, not paths, and so doesn't play right with
    virtual hosting.
 
 c. It uses the same hooks as ObjectManager to detect that
    it has been added to or removed from a container, and
    therefore the two can't easily be mixed together as
    base classes.
 
 All of these are fixable, and I feel a patch coming on.
 
 Are there some deeper problems I am not aware of?
 
 Toby Dickenson
 [EMAIL PROTECTED]
 
Re: [Zope-dev] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Erik Enge


Giovanni, which Zope version are you running?

On Tue, 26 Jun 2001, Chris McDonough wrote:

 How many indexes do you have, what are the index types, and what do
 they index?  Likewise, what about metadata?  In your last message, you
 said there's about 20.  That's a heck of a lot of indexes.  Do you
 need them all?

In my installation I have about 30 or 40
Position(Text)Index/KeywordIndex/FieldIndex.  They don't bloat much, so I
don't think that's the problem.  (The problem might be that we have
different views on what bloating is, though :)


[Zope-dev] Re: [Zope] Re: Zcatalog bloat problem (berkeleydb is a solution?)

2001-06-26 Thread Toby Dickenson


On Tue, 26 Jun 2001 06:45:54 -0400, Chris McDonough
[EMAIL PROTECTED] wrote:

I can see a potential reason for the problem you explain as "and I
remind you that as the folder get populated, the size that is added to
each transaction grows, a folder with one hundred objects adds some
100K"... It's true that normal folders (most ObjectManager-derived
containers, actually) cause database bloat within undoing storages when
an object is added to or removed from them.
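
A rough way to picture this effect, as a sketch only: assume a container keeps all of its child references in one record, and that an undoing (append-only) storage writes a fresh copy of that record on every add. The pickled dict below is a stand-in for the folder record, and the names and sizes are illustrative, not measurements of OFS.Folder.

```python
import pickle


def appended_bytes(adds):
    """Total bytes appended after `adds` objects are added one at a time."""
    children = {}
    total = 0
    for i in range(adds):
        children["doc%d" % i] = "oid-%d" % i
        # The whole container record is rewritten on each add.
        total += len(pickle.dumps(children))
    return total


small = appended_bytes(10)
large = appended_bytes(100)
print(small, large)
```

Because every add rewrites a record that keeps growing, ten times as many adds append more than ten times as many bytes in this model.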

What Chris describes would be a prudent change anyway; however, I don't
think it is the root of this problem. The tranalyzer output shows the
following line for the Folder. At a length of 363 I guess it is pretty
empty. Even if this object grows to 100k (when adding the 100th item)
it is not the single biggest cause of bloat to the total transaction
size.

(incidentally, it *was* the cause of the bloat problems that led me to
develop this patched tranalyzer)

 OID: 40817 len 363 [OFS.Folder.Folder]


The following entries I do find interesting. They are all somewhat
larger than I remember seeing before.

Are you indexing *large* properties (or storing large metadata
values)? It may be interesting to see the raw pickle data for these
large objects.. my patched tranalyzer can do that too.

 OID: 37ac4 len 25537 [BTrees.IIBTree.IISet]
 OID: 37aca len 13947 [BTrees.IIBTree.IISet]
 OID: 388ff len 24707 [BTrees.IIBTree.IISet]
 OID: 37ab8 len 39336 [BTrees.IOBTree.IOBTree]
 OID: 3c610 len 33864 [BTrees.IOBTree.IOBucket]
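
For anyone who wants to rank the records in a transaction by size, here is a small sketch that totals the "len" values per class. It assumes only the "OID: ... len ... [class]" line layout seen in the output quoted in this thread; the regular expression and the bytes_by_class helper are mine and are not part of tranalyzer.

```python
import re
from collections import defaultdict

# Matches lines such as: OID: 40817 len 363 [OFS.Folder.Folder]
LINE = re.compile(r"OID:\s+(\w+)\s+len\s+(\d+)\s+\[([^\]]+)\]")


def bytes_by_class(lines):
    """Sum the record lengths per class name."""
    totals = defaultdict(int)
    for line in lines:
        match = LINE.search(line)
        if match:
            totals[match.group(3)] += int(match.group(2))
    return dict(totals)


sample = """
OID: 40817 len 363 [OFS.Folder.Folder]
OID: 37ac4 len 25537 [BTrees.IIBTree.IISet]
OID: 37aca len 13947 [BTrees.IIBTree.IISet]
OID: 388ff len 24707 [BTrees.IIBTree.IISet]
OID: 37ab8 len 39336 [BTrees.IOBTree.IOBTree]
OID: 3c610 len 33864 [BTrees.IOBTree.IOBucket]
""".splitlines()

totals = bytes_by_class(sample)
for name, size in sorted(totals.items(), key=lambda kv: -kv[1]):
    print("%8d  %s" % (size, name))
```

On these six lines alone, the three IISet records add up to 64191 bytes against 363 for the Folder.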



Toby Dickenson
[EMAIL PROTECTED]

Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread Hannu Krosing


Jerome Alet wrote:
 
 On Tue, 26 Jun 2001, Erik Enge wrote:
 
  On Tue, 26 Jun 2001, Jerome Alet wrote:
 
   Java comes to mind, guess who is the powerful entity ;-)
 
  I really can't see that Java has been bastardized by it, though.
 
 I was told that some java programs only run under windows, that's what I
 called bastardization.
 
 However I don't know for sure, because I don't use Java: I use a beautiful
 language instead, and it's called: Python ;-)

There sure are Python programs that run only under Windows too ;)

Not that I'd recommend writing them in such a way, but it happens,
especially if they are developed/debugged under Windows only and/or use
Windows-specific extensions.

Banning such extensions also seems stupid, as one of the main strengths
of Python is its extensibility.

And the fact that you can't use some Stackless Python features
reasonably under plain C Python does not bother me at all.

It would not bother me even if people at Transmeta made a proprietary
Crusoe JIT to interpret Python bytecodes directly ;)

I would say that it would make me very glad instead, even if it caused
some Python programs to make wrong assumptions and thus run
prohibitively slowly even on 1.4 GHz Athlons.

--
Hannu

[Zope-dev] Re: [Zope] CatalogAware

2001-06-26 Thread Jeff Sasmor



Subject: [Zope] CatalogAware


 CatalogAware has been blamed for a lot of problems. The three
 weaknesses I am aware of are:
 [snip]
 
 b. It uses URLs, not paths, and so doesn't play right with
    virtual hosting
 

*

I ran into this problem using VHMonster with my EventFolder product
and found a work-around, just for anyone who might be struggling with this

See http://www.netkook.com/Members/jeff/ef/faq/document_view#vhost

This article discusses how to use _vh_ with  VHM. 
(boy does that sound cryptic...)
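
To illustrate weakness (b) quoted above (this is not the workaround from the linked FAQ): a sketch of why a URL makes a poor catalog key under virtual hosting. The host names, the url_uid and path_uid helpers, and the example path are all hypothetical.

```python
# One object, identified by its physical path inside the object database.
physical_path = ("", "site", "news", "item1")


def url_uid(base_url, visible_path):
    """Key built from the URL a particular request happened to use."""
    return base_url.rstrip("/") + "/" + "/".join(visible_path)


def path_uid(path):
    """Key built from the physical path; independent of the request."""
    return "/".join(path)


# The same object reached directly and through a virtual host that
# hides the leading "site" folder.
direct = url_uid("http://backend.example:8080", ("site", "news", "item1"))
virtual = url_uid("http://www.example.com", ("news", "item1"))
stable = path_uid(physical_path)
print(direct, virtual, stable)
```

With URL keys, the object indexed through one host cannot be found or unindexed under the key produced by the other, while the path key is the same in both cases.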

Jeff Sasmor
[EMAIL PROTECTED]
www.netkook.com





Re: [Zope-dev] ZPL and GPL

2001-06-26 Thread Hannu Krosing


Jerome Alet wrote:
 
 On Tue, 26 Jun 2001, Anthony Baxter wrote:
 
 
   Jerome Alet wrote
   I personally would love to see both Python and Zope be GPLed.
 
  Why? No really. Exactly what do you gain from this? Assuming Zope's
  license becomes GPL compatible, any packages you release you can choose
  to GPL. Why do you think having the GPL is a good thing for the core
  package? Ideological reasons? How does releasing under the GPL make
  the world a better place?

Hopefully Zope will soon be considered a universally available system 
library and this will not matter any more ;)

 For Zope it's not certain, but for Python, as well as for everything
 people usually call open source languages, the license of choice should
 be the GPL, or at least the LGPL, in order for the language in question
 not to become bastardized by some powerful entity.

I see GPL as a good license for GCC and other _compiled_ languages, but
for an interpreted language GPL or even LGPL could well be viewed as
forcing _anything_ written in it under *GPL. Even more
ridiculous would be the situation where pure Python modules can be
proprietary but modules written in C must be *GPL (think pickle vs
cPickle).

 The problem with plain GPL, as mentioned in my previous message, is that
 this would make a lot of people run away. However the LGPL seems to be a
 very good choice, because this wouldn't allow the core (of Python or Zope)
 to be bastardized with proprietary versions, while still allowing
 proprietary products/extensions to be created.

AFAIK the ability to be bastardized is one of the main strengths of
Python.
It would be extremely hard to bastardize the main Python (as it requires
you to brainwash Guido), but having proprietary (or open-source)
versions that behave differently in some ways, like ZODB-python, which
has transactional persistency, seems to be a feature and not a bug of
the Python license.

 And yes, a thousand times yes, I use the GPL for ideological reasons,
 because I really believe this will make the world a better place.

"Think global, act local" may be a good slogan for software
revolutionaries as well ;)

 
 I've thought about the LGPL, and don't see any argument against it.
 

I just can't see what LGPL would mean for _whole_ works vs. libraries
(or "lessers" as they are called nowadays ;)

---
Hannu
