Hi Stuart,

Let me explain my scenario in more detail:

Today we have approx. 400,000 items in the repository.

Our DSpace instance is populated in the following way: consider three PDF
files per day, each with a variable number of pages (100 to 5000). For each
file we create the hierarchy
Year (community) -> Month (community) -> Day (community) -> File (collection).
For instance:

  2008 -> Aug -> 13 -> My file 1
  2008 -> Aug -> 13 -> My file 2
  2008 -> Aug -> 13 -> My file 3

We upload each page of a file (a PDF of approx. 100 KB) as a separate item.

The repository is write/read only: submitted items are never updated.

What we need is to index only the files that have just been uploaded. In
short, we have to index hundreds of small items per day.

The server configuration is:

  CPU: Intel(R) Xeon(TM) CPU 3.00GHz (2992.52-MHz K8-class CPU)
  FreeBSD/SMP: Multiprocessor System Detected: 2 CPUs
  real memory  = 2147483648 (2048 MB)
  avail memory = 2056413184 (1961 MB)

The Postgres database runs on the same machine.
The disk (for both DSpace and the database) is storage mounted over NFS.

Regards,
Marcelo Gomes

On Fri, Aug 13, 2010 at 1:15 AM, Stuart Lewis <[email protected]> wrote:

> Hi Marcelo,
>
> > Every day, I import hundreds of PDFs into my DSpace. I developed a small
> > program to help with this task. After this, I need to update my index.
> >
> > I run filter-media.sh with the -s parameter to skip the previous top
> > communities. But the index-update is very slow; it takes about 2 hours.
> > How can I optimize this task?
>
> It's hard to know without knowing a bit more detail. For example:
>
> - How big are the PDFs?
> - What is your server setup like? (1 server, or DSpace on a different
>   server to your database)
> - How fast are your disks?
>
> Really you'll need to do a bit of work to see where the bottlenecks are. It
> may be that your disks are slow, or you're running out of RAM (databases
> perform better with lots of RAM), or the PDFs are very big so will take a
> while, your processors are at capacity etc.
>
> If you can answer some of these questions, we may be able to help you tune
> your setup, or suggest ways of improving the process.
>
> Thanks, and good luck,
>
>
> Stuart Lewis
> IT Innovations Analyst and Developer
> Te Tumu Herenga The University of Auckland Library
> Auckland Mail Centre, Private Bag 92019, Auckland 1142, New Zealand
> Ph: +64 (0)9 373 7599 x81928
_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech

