Re: [Zope-dev] Massive scalability

2001-01-17 Thread Steve Spicklemire


Hi Andy,

   I'm not sure what you mean by 'interface/way', so.. I'm going to 
guess at two possible interpretations.

1) Basically ZPatterns allows you to define classes (DataSkins)
instances of which can optionally be used to view/create/change/delete
external data through methods of the class ( + a little SkinScript ).

If you store your instance data in SQL you can use SQL queries,
masked from the application behind some generic method (e.g.,
"getFooIdsWithText( textToFind )") to find the id(s) of the instance(s)
you're after.  You can then get the instance from the ZPatterns
machinery and, once gotten, display it, change it, call its methods,
and/or delete it. The way these actions on the object interact with
the data in the external database is all defined in 'SkinScript', which is
hidden away as a PlugIn of a Rack deep inside the ZPatterns guts. At
the Zope application level you don't really *know* where/how the data
is stored. Best of all, you or your Product's customers can easily
customize that part *after* your product is plugged into *their*
application, without changing the basic application-level logic
and design of your product.

It's the coolest. ;-)
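
To make the idea in 1) concrete, here is a minimal plain-Python sketch. It is not the ZPatterns API: the Foo class, the two store classes and titles_matching() are hypothetical names, and sqlite3 stands in for whatever RDBMS you would really use. Only getFooIdsWithText comes from the message above. The point is just that the application-level code calls the same generic methods whichever store is plugged in.

```python
import sqlite3


class Foo:
    """Plain application object; it knows nothing about where it is stored."""

    def __init__(self, foo_id, text):
        self.id = foo_id
        self.text = text


class SqlFooStore:
    """Hypothetical store keeping Foo data in SQL (sqlite3 as a stand-in)."""

    def __init__(self, conn):
        self.conn = conn

    def getFooIdsWithText(self, textToFind):
        # The SQL is hidden behind the generic method name.
        # Note: LIKE wildcards inside textToFind are not escaped in this sketch.
        rows = self.conn.execute(
            "SELECT id FROM foo WHERE text LIKE ? ORDER BY id",
            ("%" + textToFind + "%",),
        )
        return [row[0] for row in rows]

    def getItem(self, foo_id):
        row = self.conn.execute(
            "SELECT id, text FROM foo WHERE id = ?", (foo_id,)
        ).fetchone()
        return Foo(*row) if row else None


class MemoryFooStore:
    """Hypothetical store keeping Foo objects in a dict, same interface."""

    def __init__(self, foos):
        self.foos = {foo.id: foo for foo in foos}

    def getFooIdsWithText(self, textToFind):
        needle = textToFind.lower()
        return sorted(
            foo_id for foo_id, foo in self.foos.items()
            if needle in foo.text.lower()
        )

    def getItem(self, foo_id):
        return self.foos.get(foo_id)


def make_sql_store(rows):
    """Build an in-memory SQL store from (id, text) rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE foo (id INTEGER PRIMARY KEY, text TEXT)")
    conn.executemany("INSERT INTO foo (id, text) VALUES (?, ?)", rows)
    return SqlFooStore(conn)


def titles_matching(store, text):
    """Application-level code: identical for either store."""
    return [store.getItem(i).text for i in store.getFooIdsWithText(text)]
```

Swapping MemoryFooStore for SqlFooStore changes nothing in titles_matching(), which is roughly the kind of customization "after the fact" described above.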

2) To get folks started with moving objects from ZODB to SQL I've
found ZFormulator handy as a tool to get folks quickly up to speed
in how SQL 'works'. 

http://www.zope.org/Members/faassen/ZFormulator

If they already have ZClasses, they can use this to 'automatically' 
generate starting point queries to match their class propertysheets.
Of course... it probably won't be normalized/optimized/etc.. but
it's better than doing it all for them! ;-)

-steve

 "Andy" == Andy McKay [EMAIL PROTECTED] writes:

Andy Does ZPatterns provide a nice interface / way for storing
Andy classes in a RDBMS? I have to say using an RDBMS is not as
Andy transparent as I would like, this may improve
Andy it. Finally a reason for me to try ZPatterns...

Andy -- Andy McKay.





___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )




[Zope-dev] [Fwd: Re: [Zope-dev] Massive scalability]

2001-01-17 Thread Michael Bernstein

Forwarded to the list to maintain the thread.

 Original Message 
From: John Eikenberry [EMAIL PROTECTED]
Subject: Re: [Zope-dev] Massive scalability
To: Michael Bernstein [EMAIL PROTECTED]

Michael Bernstein wrote:

 John Eikenberry wrote:
 
 Can you tell us a bit about how many indexes (and what
 types) you were maintaining about each object? Another
 poster reported no problems with 60,000 objects.

I was testing and had a real simple setup. In addition to the default
indexes I had 1 index on a string type (of 10 chars or less - last names),
2000 objects indexed. The vocabulary only had the entries for this field in
it.

Everything was nice and fast except partial searching. It was fast enough
when there were small numbers of matches. But as the number of matches grew,
the time it took grew along with it, at a nearly exponential rate.

-- 

John Eikenberry
[[EMAIL PROTECTED] - http://zhar.net]
__
"A society that will trade a little liberty for a little
order
 will deserve neither and lose both."
  --B. Franklin





Re: [Zope-dev] Massive scalability

2001-01-17 Thread Andy McKay


 Are you saying that Zope's startup and shutdown time is
 affected by the size of the ZODB?

Yep. With a small ZODB you won't notice the effect; it only shows up once the
database gets large. I found it very annoying when doing a lot of work in
Python, and so had two databases, one with a small amount of data and one with
a lot (two sets of tests). However, in the end Shane Hathaway's excellent
Refresh product saved the day.

--
  Andy McKay.







Re: [Zope-dev] Massive scalability

2001-01-17 Thread Andy McKay

Wow, that sounds perfect. Yes that's exactly what I was asking.

I can create an abstract data storage (SkinScript) that stores the data
anywhere, let's say for my purposes an RDBMS (but it could be ZODB etc). I
can then get and access classes (DataSkins) with no cares about the data
storage and use all the advantages an OO approach gives.

I've got to play with this stuff, this could solve my data storage
problems...

Thanks!
--

  Andy McKay.



Re: [Zope-dev] Massive scalability

2001-01-17 Thread ender

On Tuesday 16 January 2001 20:42, Michael Bernstein wrote:

 Are you saying that Zope's startup and shutdown time is
 affected by the size of the ZODB?

AFAIK, on a FileStorage Zope loads the index (oid, file_offset?) into
memory on start to facilitate object retrieval, which impacts startup time. I
don't think the other storages operate this way.
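
As a rough illustration of why that costs time at startup, here is a toy append-only log with an in-memory index. This is not FileStorage's real on-disk format; the record layout and the function names are made up for the sketch. The only thing it shows is that building an oid -> offset map means touching every record once, so the work grows with the size of the file.

```python
import io
import struct

# Hypothetical record header: 8-byte object id, 4-byte payload length.
HEADER = struct.Struct(">QI")


def append_record(log, oid, payload):
    """Append one record to the end of the log and return its file offset."""
    log.seek(0, io.SEEK_END)
    offset = log.tell()
    log.write(HEADER.pack(oid, len(payload)))
    log.write(payload)
    return offset


def build_index(log):
    """Scan the whole log once, mapping each oid to its latest record offset."""
    index = {}
    log.seek(0)
    while True:
        offset = log.tell()
        header = log.read(HEADER.size)
        if len(header) < HEADER.size:
            break  # end of log
        oid, length = HEADER.unpack(header)
        index[oid] = offset  # later records override earlier ones
        log.seek(length, io.SEEK_CUR)  # skip the payload itself
    return index


def load(log, index, oid):
    """Fetch an object's payload with a single seek, using the index."""
    log.seek(index[oid])
    _, length = HEADER.unpack(log.read(HEADER.size))
    return log.read(length)
```

Once the index is in memory, load() is a single seek per object; the price is the full scan in build_index() before the first request can be served.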

kapil





Re: [Zope-dev] Massive scalability

2001-01-17 Thread Andy McKay

On the plus side, any corrupted objects are fixed or deleted when you start
up the ZODB. For that reason, some days a restart is very useful :)
--
  Andy McKay.






Re: [Zope-dev] Massive scalability

2001-01-16 Thread Andy McKay

 While that would work for the simple object case, I find the
 prospect of storing a bunch of BLOBs (for the image data of
 the Photos) in an RDBMS to be *most* un-appetizing. Storing
 them on the server's file-system seems in-elegant as well.

Okey dokey, just a suggestion. I have heard people talk about large ZODBs,
but once I go over the 10,000 object mark I just find an RDBMS easier myself.
Go for it, and good luck!

 Does anyone know of any hidden 'gotchas' when dealing with
 this many objects, regardless of the hit-load on the system?

Starting and stopping Zope, the 2 GB limit (which can be avoided), and
pulling objects back out with complicated queries are my biggest gripes.






Re: [Zope-dev] Massive scalability

2001-01-16 Thread John Eikenberry

Michael Bernstein wrote:

 So, again: Has anyone run up against any performance or
 other limitations regarding large numbers (hundreds of
 thousands or more) of objects stored within the ZODB either
 in a BTree Folder or a Rack?

I was looking into the same issues recently, but for a much smaller set of
data (50,000ish). In my tests ZPatterns/binary-trees scaled well for storage
and retrieval. But ZCatalog did not. It was basically useless for partial
matching searches (taking many minutes for searches that retrieved more
than 100 matches). I was also concerned about the indexing overhead. It
doesn't scale well when changing/adding many things at a time (we might
have bulk adds/changes).

I ended up deciding to go with a RDBMS backend for data storage with a
ZPatterns interface. SkinScripts work so well for this that I'm actually
glad I switched. It simplified my design and implementation immensely. 

-- 

John Eikenberry
[[EMAIL PROTECTED] - http://zhar.net]
__
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
  --B. Franklin





Re: [Zope-dev] Massive scalability

2001-01-16 Thread Andy McKay

Does ZPatterns provide a nice interface / way for storing classes in a
RDBMS? I have to say using an RDBMS is not as transparent as I would like,
this may improve it. Finally a reason for me to try ZPatterns...

--
  Andy McKay.






Re: [Zope-dev] Massive scalability

2001-01-16 Thread Michael Bernstein

John Eikenberry wrote:
 
 Michael Bernstein wrote:
 
  So, again: Has anyone run up against any performance or
  other limitations regarding large numbers (hundreds of
  thousands or more) of objects stored within the ZODB either
  in a BTree Folder or a Rack?
 
 I was looking into the same issues recently, but for a much smaller set of
 data (50,000ish). In my tests ZPatterns/binary-trees scaled well for storage
 and retrieval. But ZCatalog did not. It was basically useless for partial
 matching searches (taking many minutes for searches that retrieved more
 than 100 matches)

Was this true even for cases where the batch size was
smaller than 100? For example, if a search returns over 100
results but the batch size is only 20 (so that only 20
results at a time are displayed), do you still get the
performance hit?

 [snip]
 I ended up deciding to go with a RDBMS backend for data storage with a
 ZPatterns interface. SkinScripts work so well for this that I'm actually
 glad I switched. It simplified my design and implementation immensely.

So you're saying that you are doing all searching using SQL
statements, and not just object retrieval and storage,
correct? How are you handling full text searches?

Cheers,

Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-16 Thread Michael Bernstein

Andy McKay wrote:
 
  While that would work for the simple object case, I find the
  prospect of storing a bunch of BLOBs (for the image data of
  the Photos) in an RDBMS to be *most* un-appetizing. Storing
  them on the server's file-system seems in-elegant as well.
 
 Okey dokey, just a suggestion. I have heard people talk about large ZODBs,
 but once I go over the 10,000 object mark I just find an RDBMS easier myself.
 Go for it, and good luck!

Thanks! I appreciate different points of view on this
problem, even if you have different 'comfort zones'.

  Does anyone know of any hidden 'gotchas' when dealing with
  this many objects, regardless of the hit-load on the system?
 
 Mostly starting and stopping Zope, [snip]

Are you saying that Zope's startup and shutdown time is
affected by the size of the ZODB?

Thanks,

Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-16 Thread Michael Bernstein

RDM wrote:
 
 On Mon, 15 Jan 2001, Michael Bernstein wrote:
  as Squishdot). Adding a dependency on an RDBMS or requiring
  additional setup on the server's FS seems a step in the
  wrong direction.
 [...]
  So the question remains: Will either approach (within the
  ZODB) allow me to scale the application to hundreds of
  thousands (or even millions) of objects indexed in a
  ZCatalog?
 [...]
  I know that the ZCatalog/ObjectManager approach used by
  Squishdot will scale to over 9,000 objects (the number of
  postings to date at technocrat.net), So I'm reasonably
  certain that my proposed ZCatalog/BTree Folder approach will
  be at least as scalable. I'm slightly less confident about
  the Specialist/Rack approach, because I don't know of any
  sites that have used them to store that many objects in the
  ZODB, but only slightly.
 
 My understanding is that the point of ZPatterns is to hide
 the data storage implementation from the application.[snip]
 The point being that you can *change your
 mind* later, with minimal disruption to your application.  Not
 only that, but people who *use* your product can make their
 own decision about where to store the data.  So by using ZPatterns
 you [...] let the users
 of your product use an RDBMS if that works better for them.

Very good points, and ones that I will keep in mind. Thanks.

 In addition, it seems to me that your comments about ZCatalog+BTree
 apply equally well to ZPatterns, since you can use the Catalog
 to index stuff stored in a rack through the use of appropriate
 triggers, and it is my understanding that the default in-ZODB
 rack storage uses BTree internally.

I do not know if BTree folders and Racks share the same
B-Tree implementation, which is why I qualified my statement
as 'slightly less confident'.
 
 Unfortunately I don't have much input on your question about
 real-life scalability...the most I've done is stored 60,000 small objects
 in a hierarchy of zope folders, indexed by the catalog, with
 no perceptible slowdown in search or retrieval speed.

Hmm. John Eikenberry mentioned a slowdown with about 50,000
objects on partial-match searches, but I don't know how
simple/complex the objects were, or how many attributes were
being indexed. How many indexes of various types was your
ZCatalog maintaining on your objects?

Thanks,

Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-16 Thread John Eikenberry

Andy McKay wrote:

 Does ZPatterns provide a nice interface / way for storing classes in a
 RDBMS? I have to say using an RDBMS is not as transparent as I would like,
 this may may improve it. Finally a reason for me to ZPatterns...
 
The best way to get a taste is to try it out. The easiest way to do this is
to install LoginManager plus the DB/DA. Then follow the instructions in...

LoginManager with SQL and Skinscript
http://www.zope.org/Members/dlpierson/sqlLogin

I had it up and running pretty quickly following these instructions.

More generally, the combo of SQLMethods and ZPatterns (i.e. SkinScripts)
seems pretty compelling. I've used SQLMethods pretty extensively for the
past 18 months.  They have definite limitations, which is why I'm working on
the object based system. The ZPatterns abstraction seems to provide the
best of both worlds.  You get the nice parts of SQLMethods (timed cache,
DTML query syntax and web-viewable SQL queries), plus you get a nice object
abstraction (much better than pluggable brains). SkinScripts allow for
easy attribute and trigger handling, basically like a simple object
description language.

If you can't tell, I'm pretty sold on ZPatterns. And once I decided that an
RDBMS was the best way to go for data storage, it fit into the 'pattern'
very nicely. I haven't deployed it yet, but it's pretty fun to work on. :)

-- 

John Eikenberry
[[EMAIL PROTECTED] - http://zhar.net]
__
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
  --B. Franklin





Re: [Zope-dev] Massive scalability

2001-01-16 Thread John Eikenberry

Michael Bernstein wrote:

 John Eikenberry wrote:
  
  I was looking into the same issues recently, but for a much smaller set of
  data (50,000ish). In my tests ZPatterns/binary-trees scaled well for storage
  and retrieval. But ZCatalog did not. It was basically useless for partial
  matching searches (taking many minutes for searches that retrieved more
  than 100 matches)
 
 Was this true even for cases where the batch size was
 smaller than 100? For example, if a search returns over 100
 results but the batch size is only 20 (so that only 20
 results at a time are displayed), do you still get the
 performance hit?

Short answer: yes 

Long answer: If you check out the source and/or hit it with the profiler,
you'll see that the way the partial search works is to first do a more
general search and then to narrow the hits as much as possible via regexes.
Both these steps have to happen no matter the batch size, and this is where
you take the performance hit.
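
A simplified model of that two-step search, to show where the time goes. This is not ZCatalog's actual code: partial_search(), first_batch() and the vocabulary/postings structures are hypothetical, and the "general search" here is assumed to be a prefix scan of the vocabulary. What it illustrates is that the complete hit list is built before any batching can happen.

```python
import re


def partial_search(vocabulary, postings, pattern):
    """Return every object id whose indexed word matches a glob pattern."""
    # Step 1: the more general search. Use the literal prefix before the
    # first wildcard to pick candidate words from the vocabulary.
    prefix = re.split(r"[*?]", pattern, maxsplit=1)[0]
    candidates = [word for word in vocabulary if word.startswith(prefix)]

    # Step 2: narrow the candidates with a regex built from the pattern
    # ('*' matches any run of characters, '?' matches one character).
    regex = re.compile(
        "".join(
            ".*" if c == "*" else "." if c == "?" else re.escape(c)
            for c in pattern
        )
    )
    hits = []
    for word in candidates:
        if regex.fullmatch(word):
            hits.extend(postings[word])
    return sorted(hits)


def first_batch(hits, batch_size):
    """Batching only slices a result list that is already complete."""
    return hits[:batch_size]
```

Asking for a batch of 20 saves rendering time, but both loops in partial_search() have already run over every candidate by the time first_batch() is called.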

  [snip]
  I ended up deciding to go with a RDBMS backend for data storage with a
  ZPatterns interface. SkinScripts work so well for this that I'm actually
  glad I switched. It simplified my design and implementation immensely.
 
 So you're saying that you are doing all searching using SQL
 statements, and not just object retrieval and storage,
 correct? How are you handling full text searches?

Yes. I'll use MySQL's built-in pattern matching facilities. It can do full
text searches with partial matching, and it can do this fast.  I'm working
on a system that will return the DataSkins in response to the query,
allowing me to deal with just the objects yet use all of MySQL's
facilities.

I've just started to work on this as part of a larger project, but I'm doing
it full time and should have something fairly soon. My company is very free
software friendly, so I'll be able to share it once it's ready, if you
happen to be interested.

-- 

John Eikenberry
[[EMAIL PROTECTED] - http://zhar.net]
__
"A society that will trade a little liberty for a little order
 will deserve neither and lose both."
  --B. Franklin





Re: [Zope-dev] Massive scalability

2001-01-15 Thread Andy McKay

 I am currently planning two separate 'Archive' type
 projects/Products. In both cases, I need to make sure that
 my implementation will scale to hundreds of thousands or
 even millions of objects.

I would recommend using an RDBMS behind Zope then. It's faster, simpler and I
have always had better results.






Re: [Zope-dev] Massive scalability

2001-01-15 Thread Michael Bernstein

Andy McKay wrote:
 
  I am currently planning two separate 'Archive' type
  projects/Products. In both cases, I need to make sure that
  my implementation will scale to hundreds of thousands or
  even millions of objects.
 
 I would recommend using an RDBMS behind Zope then. It's faster, simpler and I
 have always had better results.

While that would work for the simple object case, I find the
prospect of storing a bunch of BLOBs (for the image data of
the Photos) in an RDBMS to be *most* un-appetizing. Storing
them on the server's file-system seems in-elegant as well.

I'd like to build both of these applications as products
that can be easily installed into a Zope server (as easily
as Squishdot). Adding a dependency on an RDBMS or requiring
additional setup on the server's FS seems a step in the
wrong direction.

I'm working with objects here. I prefer to work with them
compared to the approach of decomposing my objects into
RDBMS records or files, and recomposing.

So the question remains: Will either approach (within the
ZODB) allow me to scale the application to hundreds of
thousands (or even millions) of objects indexed in a
ZCatalog?

I should stress that I am far more concerned about the
number of objects than I am about the number of 'hits'.

I know that the ZCatalog/ObjectManager approach used by
Squishdot will scale to over 9,000 objects (the number of
postings to date at technocrat.net), So I'm reasonably
certain that my proposed ZCatalog/BTree Folder approach will
be at least as scalable. I'm slightly less confident about
the Specialist/Rack approach, because I don't know of any
sites that have used them to store that many objects in the
ZODB, but only slightly.

However, so far I have not heard of anyone storing and
indexing as many homogeneous objects as I am talking about.
I'd really like to hear from anyone who has attempted to do
this or something similar.

Does anyone know of any hidden 'gotchas' when dealing with
this many objects, regardless of the hit-load on the system?

Thanks,

Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-15 Thread Jimmie Houchin

I would like to echo Michael's sentiment and comments. I am at the
beginning stage of Zope development of a website which will have
millions of objects of a single object type, and several such types. I
currently plan on using mounted databases for the various object
repositories and am exploring the best means to implement
this. I know of the mountedfilestorage product by Anthony, but with the
massive refactoring happening for 2.3 I am watching the horizon for how
such mounting will be implemented in the future.

I think there are many who would like to keep their development within
Zope using objects. Any information on best use or development
strategies for such setups would be greatly appreciated. 

Thanks,

Jimmie Houchin


Michael Bernstein wrote:
 
 Andy McKay wrote:
 
   I am currently planning two separate 'Archive' type
   projects/Products. In both cases, I need to make sure that
   my implementation will scale to hundreds of thousands or
   even millions of objects.
 
  I would recommend using an RDMBS behind Zope then. Its faster, simpler and I
  have always had better results.
 
 While that would work for the simple object case, I find the
 prospect of storing a bunch of BLOBs (for the image data of
 the Photos) in an RDBMS to be *most* un-appetizing. Storing
 them on the server's file-system seems in-elegant as well.
 
 I'd like to build both of these applications as products
 that can be easily installed into a Zope server (as easily
 as Squishdot). Adding a dependency on an RDBMS or requiring
 additional setup on the server's FS seems a step in the
 wrong direction.
 
 I'm working with objects here. I prefer to work with them
 compared to the approach of decomposing my objects into
 RDBMS records or files, and recomposing.
 
 So the question remains: Will either approach (within the
 ZODB) allow me to scale the application to hundreds of
 thousands (or even millions) of objects indexed in a
 ZCatalog?
 
 I should stress that I am far more concerned about the
 number of objects than I am about the number of 'hits'.
 
 I know that the ZCatalog/ObjectManager approach used by
 Squishdot will scale to over 9,000 objects (the number of
 postings to date at technocrat.net), so I'm reasonably
 certain that my proposed ZCatalog/BTree Folder approach will
 be at least as scalable. I'm slightly less confident about
 the Specialist/Rack approach, because I don't know of any
 sites that have used them to store that many objects in the
 ZODB, but only slightly.
 
 However, so far I have not heard of anyone storing and
 indexing as many homogeneous objects as I am talking about.
 I'd really like to hear from anyone who has attempted to do
 this or something similar.
 
 Does anyone know of any hidden 'gotchas' when dealing with
 this many objects, regardless of the hit-load on the system?
 
 Thanks,
 
 Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-15 Thread Michael Bernstein

Jimmie Houchin wrote:
 
 I would like to echo Michael's sentiment and comments. I am at the
 beginning stage of Zope development of a website which will have
 millions of objects of a single object type and multiple of such.

Are your objects intended to be indexed by ZCatalog as well,
or are you planning some other method of finding your
objects?

 [snip stuff about mounted databases]
 
 I think there are many who would like to keep their development within
 Zope using objects. Any information on best use or development
 strategies for such setups would be greatly appreciated.

Thanks for adding your voice!

I also think that applications such as we're discussing
should be possible to deploy using nothing but Zope's built-in
capabilities. I understand that Zope has certain
limitations in situations that require many writes to the
database, and I accept those limitations as being a
necessary trade-off for a storage strategy that uses an
appending file format. I am still trying to determine if
Zope (especially using the two development options I
outlined) has any built-in limitations regarding the number
(or number x size) of objects stored.
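
To make the trade-off concrete, here is a toy sketch of an appending
store in plain Python. It is not Zope's storage code; the class name
AppendOnlyStore and its methods are made up for illustration. It only
assumes that every write appends a new revision, and that old revisions
stay in the file until something copies the newest ones into a fresh file.

```python
import io
import pickle


class AppendOnlyStore:
    """Toy append-only object store: writes append, nothing is overwritten."""

    def __init__(self):
        self._log = io.BytesIO()   # stands in for the data file
        self._index = {}           # object id -> (offset, length) of newest revision

    def store(self, oid, obj):
        data = pickle.dumps(obj)
        offset = self._log.seek(0, io.SEEK_END)   # always write at the end
        self._log.write(data)
        self._index[oid] = (offset, len(data))

    def load(self, oid):
        offset, length = self._index[oid]
        self._log.seek(offset)
        return pickle.loads(self._log.read(length))

    def size(self):
        return self._log.seek(0, io.SEEK_END)

    def pack(self):
        """Copy only the newest revision of each object into a fresh log."""
        packed = AppendOnlyStore()
        for oid in self._index:
            packed.store(oid, self.load(oid))
        self._log, self._index = packed._log, packed._index


if __name__ == "__main__":
    store = AppendOnlyStore()
    for revision in range(100):
        store.store("photo-1", {"title": "sunset", "revision": revision})
    before = store.size()
    store.pack()
    print(before, store.size())   # size before and after packing
```

In this model the file grows with the number of writes, not just the
number of objects, which is why write-heavy use is the costly case.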

I really appreciate Zope's features (especially with
ZPatterns) that allow applications to be developed that are
storage-agnostic, and I feel that this is especially useful
for tying into existing legacy systems, but I don't want to
develop a new application, storing new data, that is tied to
a specific storage methodology such as an RDBMS.

People who wish to customize an application to leverage
their existing legacy data should be free to do so, but I've
noticed that Zope products that have some external system as
a pre-requisite (Worldpilot, ZCommerce, etc.) are deployed
far less often than those which do not (Squishdot, Zwiki,
etc.).

So, again: Has anyone run up against any performance or
other limitations regarding large numbers (hundreds of
thousands or more) of objects stored within the ZODB either
in a BTree Folder or a Rack?

In other words, will the system slow down if you add enough
objects?
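
One back-of-the-envelope way to frame the question for a tree-structured
container is to count the levels a lookup has to walk as the object count
grows. The sketch below is an illustration, not a measurement of Zope, and
the fan-out of 50 is a hypothetical number chosen only to show the shape
of the growth.

```python
def tree_levels(n_objects, fan_out=50):
    """Levels a balanced tree needs to address n_objects.

    fan_out is a hypothetical number of children per node.
    """
    levels, capacity = 1, fan_out
    while capacity < n_objects:
        capacity *= fan_out
        levels += 1
    return levels


if __name__ == "__main__":
    for n in (9000, 100000, 1000000):
        print(n, tree_levels(n))
```

Under that assumption, going from 9,000 to 1,000,000 objects adds one
level to a lookup, so the open question is more about indexing and write
costs than about finding a single object by id.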

Thanks,

Michael Bernstein.





Re: [Zope-dev] Massive scalability

2001-01-14 Thread Steve Alexander

Michael Bernstein wrote:

 I am currently planning two separate 'Archive' type
 projects/Products. In both cases, I need to make sure that
 my implementation will scale to hundreds of thousands or
 even millions of objects.
 

 In one project the objects are very simple ZClasses with a
 few attributes, in the other project, the objects will be
 instances of the Photo Product, and considerably larger.

Do you mean "instances of the Photo Product" or "instances of class 
FooBar from the Photo Product" ?

 One implementation I'm considering is a simple Specialist
 with a Rack. Does anyone know if there are any inherent
 limitations on the number of objects that can be stored in a
 Rack? Are there any performance limitations at the scale
 that I'm talking about?

Seeing as a Rack can provide data from absolutely anywhere, I can't see 
a problem with this.
If you're talking about the BTree implementation that Racks use when 
they store data in the ZODB, well, I've never stored more than a few 
thousand objects in one of those. There certainly aren't the same 
limitations that you get with the default ObjectManager, as that uses a 
Python dict to hold its sub-objects.
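
A toy sketch of that difference in plain Python (not ZODB code): it
assumes, for illustration only, that a container held in one dict is
written out as a single pickled record whenever it changes, while a
bucketed container rewrites just the bucket that was touched. The function
names and the bucket size of 100 are made up.

```python
import pickle


def dict_container_write_size(n_items):
    """Bytes re-pickled when one entry changes in a single-dict container."""
    container = {"obj%07d" % i: i for i in range(n_items)}
    container["obj%07d" % 0] = -1           # change one entry
    return len(pickle.dumps(container))     # the whole mapping is one record


def bucketed_write_size(n_items, bucket_size=100):
    """Bytes re-pickled when one entry changes and keys are split into buckets.

    bucket_size is a hypothetical value, not the real BTree bucket size.
    """
    keys = ["obj%07d" % i for i in range(n_items)]
    buckets = [
        {k: i for i, k in enumerate(keys[start:start + bucket_size], start)}
        for start in range(0, n_items, bucket_size)
    ]
    buckets[0][keys[0]] = -1                 # change one entry
    return len(pickle.dumps(buckets[0]))     # only the touched bucket is rewritten


if __name__ == "__main__":
    for n in (1000, 10000, 100000):
        print(n, dict_container_write_size(n), bucketed_write_size(n))
```

In this model the single-dict write grows with the number of sub-objects,
while the bucketed write stays the same size however many objects the
container holds.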

The performance limitations will more likely be to do with searching and 
indexing the data, adding the data in bulk (if you need to do this), and 
retrieving the data if you have a vast number of clients wanting it all 
at once.

 The other implementation I'm considering is to create a
 ZClass that inherits from ZCatalog and BTree Folder.

I can't think why you'd want to do that. What role would instances of 
this class play in your application?

 Would
 this approach run into any scalability problems with the
 number and type of objects I'm talking about?

I think other aspects of your application will determine whether it will 
scale. Scalability is an emergent property of a system. You only get to 
know about it when you consider the system holistically.

With Zope, where you store objects and how you plan to find them are 
more significant than what the objects you're storing are.

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net






Re: [Zope-dev] Massive scalability

2001-01-14 Thread Michael Bernstein

Steve Alexander wrote:
 
 Michael Bernstein wrote:
 
  I am currently planning two separate 'Archive' type
  projects/Products. In both cases, I need to make sure that
  my implementation will scale to hundreds of thousands or
  even millions of objects.
  
 
  In one project the objects are very simple ZClasses with a
  few attributes, in the other project, the objects will be
  instances of the Photo Product, and considerably larger.
 
 Do you mean "instances of the Photo Product" or "instances of class
 FooBar from the Photo Product" ?

Sorry. I meant instances of class Photo from the Photo
Product.

  Does anyone know if there are any inherent
  limitations on the number of objects that can be stored in a
  Rack? Are there any performance limitations at the scale
  that I'm talking about?
 
 If you're talking about the BTree implementation that Racks use when
 they store data in the ZODB, well, I've never stored more than a few
 thousand objects in one of those. There certainly aren't the same
 limitations that you get with the default ObjectManager, as that uses a
 Python dict to hold its sub-objects.
 
 The performance limitations will more likely be to do with searching and
 indexing the data, adding the data in bulk (if you need to do this), and
 retrieving the data if you have a vast number of clients wanting it all
 at once.

Yes, I was referring to the way a Rack stores data in the
ZODB.

Photo instances store several sizes of the same image as
attributes of the object, as well as various meta-data
fields. I anticipate indexing the meta-data in a ZCatalog.

Data will not be added in bulk, but several people may want
to retrieve the data at the same time (if the site becomes
popular).

  The other implementation I'm considering is to create a
  ZClass that inherits from ZCatalog and BTree Folder.
 
 I can't think why you'd want to do that. What role would instances of
 this class play in your application?

The same as the Rack. A single archive of indexed objects.
It seems that this would scale better than creating a ZClass
that inherits from ZCatalog and ObjectManager as described
here:

http://www.zope.org/Members/tseaver/inherit_ZCatalog

  Would
  this approach run into any scalability problems with the
  number and type of objects I'm talking about?
 
 I think other aspects of your application will determine whether it will
 scale. Scalability is an emergent property of a system. You only get to
 know about it when you consider the system holistically.

The system is fairly simple: I want to store a large number
of objects in a single location, I want to index them in a
ZCatalog, I want to find objects by either searching for
them, or by browsing a ZTopics-based hierarchy (that is
populated with ZCatalog searches as well).

Searches (whether a user or a ZTopic initiates them)
should complete quickly, regardless of the number of
objects (potentially hundreds of thousands), and direct
object retrievals (or rendering) should also happen quickly,
without major penalties for the number of objects.
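
To show the shape of what I'm describing, here is a toy sketch in plain
Python. It is not ZCatalog; the class ToyCatalog and its methods are
hypothetical names. It keeps objects in one mapping and a separate index
from (field, value) pairs to object ids, so a search consults the index
instead of walking every stored object.

```python
from collections import defaultdict


class ToyCatalog:
    """Toy store-plus-index: one object mapping, one field/value index."""

    def __init__(self):
        self._index = defaultdict(set)   # (field, value) -> set of object ids
        self._objects = {}               # object id -> object (a dict here)

    def index_object(self, oid, obj, fields):
        self._objects[oid] = obj
        for field in fields:
            self._index[(field, obj[field])].add(oid)

    def get(self, oid):
        """Direct retrieval by id, independent of the index."""
        return self._objects[oid]

    def search(self, **query):
        """Return sorted ids matching every field=value pair in the query."""
        result = None
        for field, value in query.items():
            ids = self._index.get((field, value), set())
            result = set(ids) if result is None else result & ids
        return sorted(result or [])


if __name__ == "__main__":
    catalog = ToyCatalog()
    catalog.index_object(1, {"subject": "birds", "year": 1999}, ["subject", "year"])
    catalog.index_object(2, {"subject": "birds", "year": 2000}, ["subject", "year"])
    catalog.index_object(3, {"subject": "trains", "year": 2000}, ["subject", "year"])
    print(catalog.search(subject="birds", year=2000))
```

A topic page would simply be a stored query run through the same search
method.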

 With Zope, where you store objects and how you plan to find them are
 more significant than what the objects you're storing are.

I hope I've explained myself better this time.

Thanks for the help,

Michael Bernstein.





[Zope-dev] Massive scalability

2001-01-13 Thread Michael Bernstein

I am currently planning two separate 'Archive' type
projects/Products. In both cases, I need to make sure that
my implementation will scale to hundreds of thousands or
even millions of objects.
In one project the objects are very simple ZClasses with a
few attributes, in the other project, the objects will be
instances of the Photo Product, and considerably larger.

One implementation I'm considering is a simple Specialist
with a Rack. Does anyone know if there are any inherent
limitations on the number of objects that can be stored in a
Rack? Are there any performance limitations at the scale
that I'm talking about?

The other implementation I'm considering is to create a
ZClass that inherits from ZCatalog and BTree Folder. Would
this approach run into any scalability problems with the
number and type of objects I'm talking about?

Thanks,

Michael Bernstein.
