Hi Pan,

The Web server aspect (i.e. Tomcat) should have fairly constant memory use -- the vast majority of operations are very short and work on a very small number of objects, and as soon as the request is over any memory used is returned to the heap. How much memory you need to give it largely depends on the load, i.e. how many requests the server will be servicing at a given instant.
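If you want to see where your own instance sits, one option is to sample the heap figures from inside the JVM while it is under typical load. Below is a minimal sketch using the standard java.lang.management API. The class name HeapSnapshot and the describe() helper are made up for illustration and are not part of DSpace or Tomcat.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.MemoryUsage;

public class HeapSnapshot {
    private static final long MB = 1024L * 1024L;

    /** Bytes of heap currently in use, as reported by the running JVM. */
    public static long usedHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getUsed();
    }

    /** Upper heap limit (roughly what -Xmx sets), or -1 if the JVM reports it as undefined. */
    public static long maxHeapBytes() {
        MemoryUsage heap = ManagementFactory.getMemoryMXBean().getHeapMemoryUsage();
        return heap.getMax();
    }

    /** Formats a used/max pair in whole megabytes; hypothetical helper for this sketch. */
    public static String describe(long usedBytes, long maxBytes) {
        long usedMb = usedBytes / MB;
        if (maxBytes <= 0) {
            return "heap: " + usedMb + " MB used (max undefined)";
        }
        long percent = usedBytes * 100 / maxBytes;
        return "heap: " + usedMb + " MB used of " + (maxBytes / MB) + " MB max (" + percent + "%)";
    }

    public static void main(String[] args) {
        // Print one sample; in practice you would take several while the server is busy.
        System.out.println(describe(usedHeapBytes(), maxHeapBytes()));
    }
}
```

A single sample says little, since the used figure rises and falls with garbage collection. Taking a handful of samples at busy times gives a rough idea of how much headroom the current heap setting leaves.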
The areas where I think folks have run into memory use issues are batch importing, indexing and the media filters (thumbnail generation, text extraction for indexing) -- these operate on a large number of objects at once, and some of the DSpace code isn't so great at freeing up objects in these operations. But we're finding the problems and fixing them, as Cory mentions.

Getting technical below:

Developers: a quick scan of the code shows that:

  batch export (classic): needs fixing
  batch import (classic): needs fixing
  browse indexer: needs fixing
  search (lucene indexer): needs fixing
  media filter: OK
  history system: problems recording collection state (loads all items into memory)
  Sitemap generator: OK
  checksum checker: fine, but only because it has its own DB access routines and doesn't use the APIs (!)

The new-style packager (with plug-ins) only appears to be able to operate on one Item at a time.

Also found: BitstreamStorageManager appears to reach up into the business logic layer and use the checker API (!!!!) -- this needs fixing. This is probably because the checksum checker includes its own DB access API :-O

The above could probably be fixed for 1.4.2, with the potential exception of the checksum checker, which needs to be changed to use the correct APIs.

Rob

On 18/04/07, Pan Family <[EMAIL PROTECTED]> wrote:
> Thank you all for giving your opinion!
>
> Technically, is it the web application or the indexer that requires
> most of the memory? What data is kept in memory all the time
> (even when nobody is searching)? Is the memory usage proportional
> to the number of concurrent sessions?
>
> Thanks again,
>
> Pan
>
> On 4/18/07, Cory Snavely <[EMAIL PROTECTED]> wrote:
> > Well, as I said at first, it all depends on your definition of what a
> > memory hog is. Today's hog fits in tomorrow's pocket. We better all
> > already be used to that.
> >
> > Also, I don't think for a *minute* that the original developers of
> > DSpace made a casual choice about their development environment--in
> > fact, I think they made a responsible choice given the alternatives.
> > Let's give our colleagues credit that's due. Their choice permits
> > scaling and fits well for an open-source project. Putting the general
> > problem of memory bloat in their laps seems pretty angsty to me.
> >
> > Lastly, dedicating a server to DSpace is a choice, not a necessity. We
> > as implementors have complete freedom to separate out the database and
> > storage tiers, and mechanisms exist for scaling Tomcat horizontally as
> > well. In the other direction, I suspect people are running DSpace on
> > VMware or xen virtual machines, too.
> >
> > Cory Snavely
> > University of Michigan Library IT Core Services
> >
> > On Wed, 2007-04-18 at 13:40 -0500, Brad Teale wrote:
> > > Pan,
> > >
> > > Dspace is a memory hog considering the functionality the application
> > > provides. This is mainly due to the technological choices made by the
> > > founders of the Dspace project, and not the functional requirements the
> > > Dspace project fulfills.
> > >
> > > Application and memory bloat are pervasive in the IT industry. Each
> > > individual organization should look at their requirements whether they
> > > are hardware, software or both. Having to dedicate a machine to an
> > > application, especially a relatively simple application like Dspace, is
> > > wasteful for hardware resources and people resources.
> > >
> > > Web applications should _not_ need 2G of memory to "run comfortably".
_______________________________________________
DSpace-tech mailing list
https://lists.sourceforge.net/lists/listinfo/dspace-tech

