Re: [Zope-dev] Specialist/Rack scalability

2001-01-22 Thread Michael Bernstein

"Phillip J. Eby" wrote:
> 
> Just to expand a little on the above...  Racks should scale at least as
> well as, if not better than, a ZCatalog, given the same storage backing for the
> ZODB.  This is because ZCatalog has to manage a minimum of one forward and
> reverse BTree for *each* index, plus another few BTrees for overall storage
> and housekeeping.  Also, keyword and full text indexes store multiple BTree
> entries per object, so that's a factor as well.
> 
> So don't worry about the Rack.  If you're using a Rack, you can store the
> data anywhere, and you can index it in an RDBMS, LDAP directory, ZCatalog,
> or some combination thereof, using triggers to keep the data in sync.

Thanks Phillip, that's reassuring. I guess now I need to make
certain that the ZCatalog can scale as far as I need it to.

Michael Bernstein.

___
Zope-Dev maillist  -  [EMAIL PROTECTED]
http://lists.zope.org/mailman/listinfo/zope-dev
**  No cross posts or HTML encoding!  **
(Related lists - 
 http://lists.zope.org/mailman/listinfo/zope-announce
 http://lists.zope.org/mailman/listinfo/zope )




Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Phillip J. Eby

At 07:12 PM 1/21/01 +, Steve Alexander wrote:
>
>So, storing things in a Rack happens in a number of stages:
>
>   Your application interacts with the Rack
>   The Rack (perhaps) stores the object persistently in its BTree
>   The BTree is a collection of persistent ZODB objects
>   The ZODB objects are stored as Python Pickles in a FileStorage
>
>We can consider what the effect of storing 60 000 objects is at each of 
>these interfaces.
>
>The Rack shouldn't have a problem with 60 000 objects.
>
>I doubt a BTree would have a problem.
>
>The ZODB might not like accessing many large objects during a single 
>transaction, as all those objects need to be in memory at once.
>
>A FileStorage should have no problem reading 60 000 stored objects. 
>However, if these objects are changing much, your Data.fs will grow 
>fast. In any case, you may find undo and history screens take a long 
>time to appear.
>
>However, if you are using a Rack, you have a lot of choice about where 
>you put your data. You can put frequently changed aspects of your data 
>on the filesystem, and the rest in FileStorage for example.

Just to expand a little on the above...  Racks should scale at least as
well as, if not better than, a ZCatalog, given the same storage backing for the
ZODB.  This is because ZCatalog has to manage a minimum of one forward and
reverse BTree for *each* index, plus another few BTrees for overall storage
and housekeeping.  Also, keyword and full text indexes store multiple BTree
entries per object, so that's a factor as well.
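To make the bookkeeping argument concrete, here is a rough counting sketch in plain Python. It is not ZCatalog code: `ToyKeywordIndex` is a hypothetical class, dicts stand in for BTrees, and the three keywords per object are invented for illustration. It only shows that one keyword index already holds several entries per object, while the Rack holds one.

```python
from collections import defaultdict


class ToyKeywordIndex:
    """Hypothetical keyword index; dicts stand in for the two BTrees."""

    def __init__(self):
        self.forward = defaultdict(set)  # keyword -> ids of objects carrying it
        self.reverse = {}                # object id -> keywords indexed for it

    def index_object(self, object_id, keywords):
        self.reverse[object_id] = tuple(keywords)
        for keyword in keywords:
            self.forward[keyword].add(object_id)

    def entry_count(self):
        # One forward entry per (keyword, object) pair, plus one reverse
        # entry per object.
        forward_entries = sum(len(ids) for ids in self.forward.values())
        return forward_entries + len(self.reverse)


rack = {}                      # stand-in for the Rack: one entry per object
index = ToyKeywordIndex()
for object_id in range(1000):
    rack[object_id] = {"title": "photo %d" % object_id}
    keywords = ["landscape", "year-%d" % (object_id % 10), "id-%d" % object_id]
    index.index_object(object_id, keywords)

rack_entries = len(rack)
index_entries = index.entry_count()
print(rack_entries, index_entries)
```

With three keywords per object, this single index holds four entries for every one in the Rack, and a catalog with several indexes multiplies that again.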

So don't worry about the Rack.  If you're using a Rack, you can store the
data anywhere, and you can index it in an RDBMS, LDAP directory, ZCatalog,
or some combination thereof, using triggers to keep the data in sync.






Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Steve Alexander

Michael Bernstein wrote:
 >

>> Make sure that each large attribute is an instance of a class that
>> derives from Persistent.
> 
> Ok, I'll give that a try. Since Photo is a Python Product,
> what will happen to current instances if I make this (and
> only this) change?

I don't know. I can think of reasons that it might be ok. I can also 
rationalize why it would cause badness. :-)

 
>> [ put images in their own specialist ]

>
> I'm not certain that that makes sense, since the Images are
> really cached 'views' of the Photo object. When a new image
> is uploaded to replace an existing one, *all* versions
> (thumbnails, small, medium, large, etc) are regenerated.

Makes sense to me. You're not generating them on the fly; you're storing 
them persistently.

If you put them in their own Specialist and Rack or Racks, you get to
say how they are stored entirely independently of how the Photo objects
are stored.

I would have just one Images specialist, and then probably store them in 
different racks, but expose them to the rest of the application as all 
being of the same class of Image, but with a different image_size 
attribute; either "thumbnail", "small", "medium" or "large".
That way, I could make the small rack generate thumbnails from the 
medium rack if, for example, the small size was rarely requested.

There are many ways to design that though, and it depends on how you 
want things to work. (Obviously :-) )

> But assuming that I went so far as to break out the Images
> to their own Rack, would you recommend that each image size
> have a dedicated Rack, or would you suggest that all images
> be stored in the same Rack?

There are advantages and disadvantages to each approach. However, you 
should be hiding the details of what Racks exist behind the facade of 
the Specialist.

The Specialist will have a getItem method, which will get an Image from 
the appropriate rack, and probably some methods like 
listImagesFor(photo) and getImageFor(image_type, photo) so you can get 
all the images for a particular photo.

Perhaps also storeImageFor(photo, original_image), which would end up 
processing and storing images derived from the original image.
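A minimal sketch of such a facade, in plain Python rather than ZPatterns: the method names `getImageFor`, `listImagesFor` and `storeImageFor` come from the description above, while the class name `ImageSpecialist`, the dict-based racks and the `_render` placeholder are assumptions made only for illustration.

```python
class ImageSpecialist:
    """Hypothetical facade; one dict per image size stands in for a Rack."""

    SIZES = ("thumbnail", "small", "medium", "large")

    def __init__(self):
        # Callers never see these; the storage layout can change freely.
        self._racks = {size: {} for size in self.SIZES}

    def storeImageFor(self, photo, original_image):
        # Regenerate every derived size whenever a new original arrives.
        for size in self.SIZES:
            self._racks[size][photo] = self._render(original_image, size)

    def getImageFor(self, image_type, photo):
        return self._racks[image_type][photo]

    def listImagesFor(self, photo):
        return [self._racks[size][photo]
                for size in self.SIZES
                if photo in self._racks[size]]

    @staticmethod
    def _render(original_image, size):
        # Placeholder for real image scaling; returns a tagged record.
        return {"image_size": size, "data": "%s@%s" % (original_image, size)}


images = ImageSpecialist()
images.storeImageFor("photo-1", "sunset.jpg")
thumb = images.getImageFor("thumbnail", "photo-1")
all_sizes = [img["image_size"] for img in images.listImagesFor("photo-1")]
```

Because callers only use the three methods, switching between one Rack per size and a single shared Rack would change `_racks` and nothing else.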

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net






Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Michael Bernstein

Steve Alexander wrote:
> 
> Michael Bernstein wrote:
> 
> > There is some question in my mind if
> > accessing any attribute (such as the thumbnail version)
> > causes all attributes to be loaded into memory. If so,
> > displaying a list of images with thumbnails may result in
> > many large objects being loaded into memory.
> 
> Make sure that each large attribute is an instance of a class that
> derives from Persistent.

Ok, I'll give that a try. Since Photo is a Python Product,
what will happen to current instances if I make this (and
only this) change?

> Of course, if this is a ZPatterns application, you'd probably want to
> have the images in their own Rack, and use an Attribute Provider on your
> Photo objects that gets the images for a Photo as needed. The Photo
> (with meta-data) and the images are entirely different objects, accessed
> via different Racks, via different Specialists.

I'm not certain that that makes sense, since the Images are
really cached 'views' of the Photo object. When a new image
is uploaded to replace an existing one, *all* versions
(thumbnails, small, medium, large, etc) are regenerated.

But assuming that I went so far as to break out the Images
to their own Rack, would you recommend that each image size
have a dedicated Rack, or would you suggest that all images
be stored in the same Rack?

Thanks,

Michael.





Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Steve Alexander

Michael Bernstein wrote:

>
> There is some question in my mind if
> accessing any attribute (such as the thumbnail version)
> causes all attributes to be loaded into memory. If so,
> displaying a list of images with thumbnails may result in
> many large objects being loaded into memory.

Make sure that each large attribute is an instance of a class that 
derives from Persistent.
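The reason this helps: the ZODB writes an object's ordinary attributes into the same record as the object, whereas an attribute that is itself a Persistent instance gets its own record and is loaded only when touched. The sketch below is a toy model of that difference in plain Python, not the ZODB API. `ToyStore`, the record ids and the byte sizes are all hypothetical.

```python
import pickle


class ToyStore:
    """Hypothetical record store that counts the bytes read by each load."""

    def __init__(self):
        self.records = {}
        self.bytes_loaded = 0

    def save(self, oid, state):
        self.records[oid] = pickle.dumps(state)

    def load(self, oid):
        data = self.records[oid]
        self.bytes_loaded += len(data)
        return pickle.loads(data)


store = ToyStore()
large = b"x" * 1000000   # stands in for the full-size image
thumb = b"t" * 2000      # stands in for the thumbnail

# Layout 1: images are plain attributes, so they share the Photo's record.
store.save("photo-inline", {"title": "Sunset", "thumbnail": thumb, "large": large})

# Layout 2: each image has its own record; the Photo holds only references.
store.save("img-thumb", thumb)
store.save("img-large", large)
store.save("photo-split", {"title": "Sunset",
                           "thumbnail": "img-thumb", "large": "img-large"})


def thumbnail_inline():
    return store.load("photo-inline")["thumbnail"]


def thumbnail_split():
    photo = store.load("photo-split")
    return store.load(photo["thumbnail"])


store.bytes_loaded = 0
thumbnail_inline()
inline_cost = store.bytes_loaded   # includes the full-size image

store.bytes_loaded = 0
thumbnail_split()
split_cost = store.bytes_loaded    # Photo record plus thumbnail only
```

In the split layout a page of thumbnails never reads the large images, which is the effect that deriving the image classes from Persistent is meant to achieve.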

Of course, if this is a ZPatterns application, you'd probably want to 
have the images in their own Rack, and use an Attribute Provider on your 
Photo objects that gets the images for a Photo as needed. The Photo 
(with meta-data) and the images are entirely different objects, accessed 
via different Racks, via different Specialists.

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net






Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Michael Bernstein

Steve Alexander wrote:
> 
> Hi Michael,
> 
> Michael Bernstein wrote:
> 
> >
> > It seems clear that indexing and searching are more of a
> > bottleneck than storage/retrieval. Nevertheless, so far I
> > have not heard of anyone trying to store more than 60,000
> > objects in a rack. I need to know if there is any reason to
> > suspect that storage (in the ZODB) or retrieval performance
> > would suffer if the number of objects was in the hundreds of
> > thousands or even millions.
> 
> [snip]
>
> So, storing things in a Rack happens in a number of stages:
> 
>    Your application interacts with the Rack
>    The Rack (perhaps) stores the object persistently in its BTree
>    The BTree is a collection of persistent ZODB objects
>    The ZODB objects are stored as Python Pickles in a FileStorage
> 
> We can consider what the effect of storing 60 000 objects is at each of
> these interfaces.

Are there any differences if you scale the number of objects
up to the hundreds of thousands or even into the millions?

> The Rack shouldn't have a problem with 60 000 objects.
> 
> I doubt a BTree would have a problem.
> 
> The ZODB might not like accessing many large objects during a single
> transaction, as all those objects need to be in memory at once.

Neither of my applications require batch adds to the DB,
however, one of them (the image archive) has objects
(Photos) with several images as attributes. This results in
a fairly large object. There is some question in my mind if
accessing any attribute (such as the thumbnail version)
causes all attributes to be loaded into memory. If so,
displaying a list of images with thumbnails may result in
many large objects being loaded into memory.

> A FileStorage should have no problem reading 60 000 stored objects.
> However, if these objects are changing much, your Data.fs will grow
> fast. In any case, you may find undo and history screens take a long
> time to appear.

No. Once added, I don't expect the data to change
frequently.

Thanks for the feedback.

Michael Bernstein.





Re: [Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Steve Alexander

Hi Michael,

Michael Bernstein wrote:

>
> It seems clear that indexing and searching are more of a
> bottleneck than storage/retrieval. Nevertheless, so far I
> have not heard of anyone trying to store more than 60,000
> objects in a rack. I need to know if there is any reason to
> suspect that storage (in the ZODB) or retrieval performance
> would suffer if the number of objects was in the hundreds of
> thousands or even millions.

I can't answer your question; however, I may be able to help clarify the 
question.

The ZODB is really just a transaction manager, and an interface and 
contract of behaviour, for an object database.

You can plug a variety of Storages into the ZODB. The default storage 
that Zope comes with is FileStorage -- Data.fs.

There are also BerkeleyStorage, OracleStorage, DBMStorage, and others, 
in varying states of finishedness.

So, storing things in a Rack happens in a number of stages:

   Your application interacts with the Rack
   The Rack (perhaps) stores the object persistently in its BTree
   The BTree is a collection of persistent ZODB objects
   The ZODB objects are stored as Python Pickles in a FileStorage

We can consider what the effect of storing 60 000 objects is at each of 
these interfaces.

The Rack shouldn't have a problem with 60 000 objects.

I doubt a BTree would have a problem.

The ZODB might not like accessing many large objects during a single 
transaction, as all those objects need to be in memory at once.

A FileStorage should have no problem reading 60 000 stored objects. 
However, if these objects are changing much, your Data.fs will grow 
fast. In any case, you may find undo and history screens take a long 
time to appear.

However, if you are using a Rack, you have a lot of choice about where 
you put your data. You can put frequently changed aspects of your data 
on the filesystem, and the rest in FileStorage for example.
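The stages above, and the point about Data.fs growth, can be modelled roughly in plain Python. This is a toy, not the ZODB or ZPatterns API: `ToyFileStorage`, `ToyRack`, `store_item` and `get_item` are hypothetical names, a dict stands in for the BTree, and transactions, undo and packing are not modelled. The one behaviour it does copy from FileStorage is that writes are appended rather than overwritten.

```python
import pickle


class ToyFileStorage:
    """Hypothetical append-only log of pickled records."""

    def __init__(self):
        self.log = []      # every write appends; nothing is overwritten
        self.latest = {}   # oid -> position of the newest record in the log

    def store(self, oid, obj):
        self.log.append(pickle.dumps(obj))
        self.latest[oid] = len(self.log) - 1

    def load(self, oid):
        return pickle.loads(self.log[self.latest[oid]])

    def size(self):
        return sum(len(record) for record in self.log)


class ToyRack:
    """Hypothetical Rack; a dict stands in for its BTree of item ids."""

    def __init__(self, storage):
        self.storage = storage
        self.ids = {}      # item key -> oid in the storage

    def store_item(self, key, data):
        oid = self.ids.setdefault(key, len(self.ids))
        self.storage.store(oid, data)

    def get_item(self, key):
        return self.storage.load(self.ids[key])


storage = ToyFileStorage()
rack = ToyRack(storage)
for i in range(60000):
    rack.store_item(i, {"title": "photo %d" % i})

records_after_load = len(storage.log)
size_after_load = storage.size()

# Changing one object 1000 times appends 1000 more records to the log.
for revision in range(1000):
    rack.store_item(0, {"title": "photo 0", "revision": revision})

records_after_updates = len(storage.log)
```

Reading stays a single lookup however many objects exist, but each change adds a record, which is why frequently changing data makes the file grow until old revisions are packed away.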

--
Steve Alexander
Software Engineer
Cat-Box limited
http://www.cat-box.net






[Zope-dev] Specialist/Rack scalability

2001-01-21 Thread Michael Bernstein

After considering the feedback I got from the previous
'Massive scalability' thread, I decided to split my queries
into two areas: Rack scalability and ZCatalog scalability.
This email deals with the former.

It seems clear that indexing and searching are more of a
bottleneck than storage/retrieval. Nevertheless, so far I
have not heard of anyone trying to store more than 60,000
objects in a rack. I need to know if there is any reason to
suspect that storage (in the ZODB) or retrieval performance
would suffer if the number of objects was in the hundreds of
thousands or even millions.

Does anyone have anecdotal or benchmark data that would
suggest what happens with that many objects?

Thanks,

Michael Bernstein.
