I am conducting a couple of performance benchmarks and am currently working towards ingesting ~2 million metadata-only items into a DSpace 3.0 instance. One of the things I intend to do is benchmark the ingestion process, and I was hoping someone could help with the following queries:
1. I am guessing the Batch Metadata Editing tool [1] is appropriate in this context, as opposed to the Simple Archive Format importer [2]; is that right?

2. Are the experiments used to arrive at the recommended line limit of 1,000 specified here [2] documented anywhere, and are there factors I should consider that could have a bearing on this limit? In other words, is the 1k size a standard of some sort?

3. Are there any other documented scalability/performance benchmarks besides this one [3]?

I'd also appreciate any other comments/suggestions.

[1] https://wiki.duraspace.org/display/DSDOC3x/Batch+Metadata+Editing#BatchMetadataEditing-AddingMetadata-OnlyItems
[2] https://wiki.duraspace.org/display/DSDOC3x/Importing+and+Exporting+Items+via+Simple+Archive+Format
[3] http://archive.nlm.nih.gov/pubs/ceb2008/2008016.pdf
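For context, the sketch below is roughly how I am currently thinking of driving the run: split the full CSV into chunks of about 1,000 data rows and push each chunk through the command-line importer, timing each call. It assumes the metadata-import launcher command and the "+"-in-the-id-column convention for adding new items described in [1]; the install path, e-person e-mail, source filename and the -e/-s options are placeholders from memory that I would still confirm against the launcher's help output, so please treat it as a rough outline rather than a finished script.

#!/usr/bin/env python
# Rough sketch only: split a large metadata-only CSV into ~1,000-row chunks
# and feed each chunk to 'dspace metadata-import'. Paths, the e-person
# e-mail and the -e/-s options are assumptions to be checked locally.
import csv
import subprocess

DSPACE_BIN = "/dspace/bin/dspace"        # assumption: default install location
EPERSON = "[email protected]"    # assumption: e-person performing the import
SOURCE_CSV = "metadata-only-items.csv"   # '+' in the id column marks new items, per [1]
CHUNK_SIZE = 1000                        # the recommended line limit asked about in question 2

def write_chunk(header, rows, index):
    """Write one chunk, repeating the header row so each file imports on its own."""
    name = "chunk-%05d.csv" % index
    with open(name, "w", newline="") as out:
        writer = csv.writer(out)
        writer.writerow(header)
        writer.writerows(rows)
    return name

def run_import(chunk_file):
    """Invoke the command-line importer; timing each call gives per-chunk numbers."""
    subprocess.check_call([DSPACE_BIN, "metadata-import",
                           "-f", chunk_file, "-e", EPERSON, "-s"])

def main():
    with open(SOURCE_CSV, newline="") as src:
        reader = csv.reader(src)
        header = next(reader)            # e.g. id,collection,dc.title,dc.contributor.author
        rows, index = [], 0
        for row in reader:
            rows.append(row)
            if len(rows) == CHUNK_SIZE:
                run_import(write_chunk(header, rows, index))
                rows, index = [], index + 1
        if rows:
            run_import(write_chunk(header, rows, index))

if __name__ == "__main__":
    main()

Each chunk repeats the CSV header so that an individual file can be re-run on its own if an import fails part-way through the batch.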
--
Lighton Phiri
http://lightonphiri.org
