On 5 November 2010 19:55, Graham Triggs <grahamtri...@gmail.com> wrote:
> There have been a few improvements in DSpace 1.7 recently. I just ran a
> test on my MacBook Pro. My local repository started with 94072 items
> already installed.
>
> Running the ItemImport command, over a period of 5 minutes, I was able to
> consistently observe ingest rates of between 8 and 12 items per second
> (minute intervals of 94722, 95060, 95550, 96249 and 96864 items installed).
> This is using Postgres-based browse tables and a Lucene search index.
>
> Note that these were metadata-only items, although not entirely random - if
> you take a look in DSpace trunk, I've added into an org.dspace.testing
> package a PubmedToImport class - which will use a SAX parser to spit out
> DSpace import format directories from a medline.xml file (you can easily
> generate a large file consisting of many thousands of items from
> http://www.ncbi.nlm.nih.gov). It's very rough around the edges, and it's
> not a complete mapping of the data, but it provides a decent amount of
> reasonably 'real world' test data very quickly.
>
>
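To give a flavour of the conversion described above, here is a minimal sketch. It is not the actual PubmedToImport class from trunk, only an illustration of the SAX approach. The names MedlineToArchiveSketch and CitationHandler, the item_N directory naming, the empty contents file and the choice of elements to map (ArticleTitle to dc.title, AbstractText to dc.description.abstract) are all assumptions of mine, and it ignores most of the record.

```java
import java.io.IOException;
import java.io.StringReader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.ParserConfigurationException;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.Attributes;
import org.xml.sax.InputSource;
import org.xml.sax.SAXException;
import org.xml.sax.helpers.DefaultHandler;

class MedlineToArchiveSketch {

    /** Builds one dublin_core.xml document per MedlineCitation element. */
    static class CitationHandler extends DefaultHandler {
        final List<String> items = new ArrayList<String>();
        private final StringBuilder text = new StringBuilder();
        private StringBuilder fields = new StringBuilder();

        @Override
        public void startElement(String uri, String local, String qName, Attributes atts) {
            // Start collecting text afresh only for the elements we map.
            if (qName.equals("ArticleTitle") || qName.equals("AbstractText")) {
                text.setLength(0);
            }
        }

        @Override
        public void characters(char[] ch, int start, int length) {
            text.append(ch, start, length);
        }

        @Override
        public void endElement(String uri, String local, String qName) {
            if (qName.equals("ArticleTitle")) {
                addField("title", "none");
            } else if (qName.equals("AbstractText")) {
                addField("description", "abstract");
            } else if (qName.equals("MedlineCitation")) {
                // End of one record: emit the item and reset for the next.
                items.add("<dublin_core>\n" + fields + "</dublin_core>\n");
                fields = new StringBuilder();
            }
        }

        private void addField(String element, String qualifier) {
            fields.append("  <dcvalue element=\"").append(element)
                  .append("\" qualifier=\"").append(qualifier).append("\">")
                  .append(escape(text.toString().trim()))
                  .append("</dcvalue>\n");
        }
    }

    /** Escapes the characters that are not allowed in XML text content. */
    static String escape(String s) {
        return s.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
    }

    /** Parses MEDLINE-style XML and returns one dublin_core.xml string per citation. */
    static List<String> parse(String xml) {
        CitationHandler handler = new CitationHandler();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.items;
    }

    /** Writes each item to its own directory (hypothetical item_N layout). */
    static void writeItems(List<String> items, Path outDir) throws IOException {
        for (int i = 0; i < items.size(); i++) {
            Path itemDir = Files.createDirectories(outDir.resolve("item_" + i));
            Files.write(itemDir.resolve("dublin_core.xml"),
                    items.get(i).getBytes(StandardCharsets.UTF_8));
            // Metadata-only item: an empty contents file is an assumption here.
            Files.write(itemDir.resolve("contents"), new byte[0]);
        }
    }

    public static void main(String[] args) {
        String sample = "<MedlineCitationSet><MedlineCitation><Article>"
                + "<ArticleTitle>An example title</ArticleTitle>"
                + "</Article></MedlineCitation></MedlineCitationSet>";
        for (String item : parse(sample)) {
            System.out.print(item);
        }
    }
}
```

Because SAX streams the document, only one record is held in memory at a time while it is being built, which is what makes the approach suitable for very large input files. This sketch keeps the finished items in a list for simplicity; a real converter would write each one out as it completes.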
I forgot to add - that is not an ItemImport-specific hack. This kind of
performance applies to web form submissions, item edits, deletions, and
SWORD deposits (although anyone expecting to fire items at the SWORD
server at that rate ought to bear in mind that the packaging format is going
to add overhead to the overall processing).
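
For anyone who wants to check the figures, the per-minute counts quoted above can be turned into rates with some simple arithmetic. A minimal sketch, assuming the 94072 starting count was taken at time zero and the five later counts exactly 60 seconds apart (the class and method names are made up for illustration):

```java
class IngestRate {

    /** Average items per second between each pair of consecutive counts. */
    static double[] ratesPerSecond(long[] counts, int intervalSeconds) {
        double[] rates = new double[counts.length - 1];
        for (int i = 1; i < counts.length; i++) {
            rates[i - 1] = (counts[i] - counts[i - 1]) / (double) intervalSeconds;
        }
        return rates;
    }

    /** Average items per second over the whole run. */
    static double overallRate(long[] counts, int intervalSeconds) {
        long ingested = counts[counts.length - 1] - counts[0];
        return ingested / (double) (intervalSeconds * (counts.length - 1));
    }

    public static void main(String[] args) {
        // Starting count followed by the five per-minute readings from the message.
        long[] counts = {94072, 94722, 95060, 95550, 96249, 96864};
        double[] rates = ratesPerSecond(counts, 60);
        for (int i = 0; i < rates.length; i++) {
            System.out.printf("minute %d: %.2f items/s%n", i + 1, rates[i]);
        }
        System.out.printf("overall: %.2f items/s%n", overallRate(counts, 60));
    }
}
```

Under those assumptions the run ingested 2792 items in 300 seconds, or roughly 9.3 items per second overall. These are averages over each minute, so they need not match the rates observed at any particular moment.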
G
_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel