Mark and all:

Even if the proposed patch doesn't fit in with the current architecture of the system, I think it would be useful to make a binary that includes the fast import code easily available.
Graham made some excellent points yesterday evening. I'm paraphrasing and may have muddled this a bit, but:

- Just because a system has been made faster in one area doesn't mean it's now scalable
- A gigantic system may break or become unusable in other areas and need other adjustments - for example, search indexes may need to be sharded.

Making the fast import tool available, at least as an option, would give organizations one means of quickly loading large amounts of their data into test systems so that they can start to poke at prototypes of gigantic systems and see where they might break.

I know that there are people with data collection, testing, and research skills at organizations that have access to large amounts of data, and experience with the DSpace system, who could justify spending staff resources on identifying the scalability issues if they could show a gigantic system now. This fast import tool would help them produce the giant test system.

Can the fast importer be made readily available somewhere as an aid to identifying and testing scalability issues in the current and future versions of DSpace?

thanks,
keith

----- Original Message -----
From: "Mark Diggory" <mdigg...@atmire.com>
To: "Simon Brown" <st...@cam.ac.uk>, dspace-devel@lists.sourceforge.net
Sent: Wednesday, January 27, 2010 6:32:48 PM GMT -05:00 US/Canada Eastern
Subject: Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

We discuss it because we seek to maintain an appropriate separation of concerns in our architecture. And because Graham usually challenges us to look at aspects of that architecture that are important.

What is under discussion is not that performance can't be improved by your patch, you've identified a very important issue in batch processing. We are discussing architecturally if we want to alter the Context/EventManager framework and expose calls to pruneIndex.
We want to be careful to avoid exposing too much of the internals of the Browse system outside in the application architecture.

Excellent work on finding a means to improve DSpace performance.

Cheers,
Mark

-- 
Mark R. Diggory
Head of U.S. Operations - @mire

_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel