Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-02-17 Thread Tom De Mulder
On Wed, 17 Feb 2010, Tim Donohue (JIRA) wrote:
> [15:40] kshepherd DS-470 +1 to the general idea, but graham had some reasonable objections to viewing speeding up batch jobs as a priority over reducing system load
I'd like to point out that this has never been substantiated, and that we have

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-28 Thread Graham Triggs
On 28 Jan 2010, at 14:04, Simon Brown wrote: Having dug through the code a little more in the meantime, it seems that the effect of pruneIndexes() is to remove from the browse indexes information about items which are expunged and/or withdrawn; in that light it might not be necessary to

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-28 Thread Richard, Joel M
Hi All, I'm still new to DSpace and all its intricacies, so if this is a repeat of existing knowledge, forgive me. Continuing Graham's findings, I thought I would throw this out there based on my experience having managed PostgreSQL over the past several years. If you are using anything less

[Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-27 Thread Simon Brown (JIRA)
[ http://jira.dspace.org/jira/browse/DS-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11107#action_11107 ] Simon Brown commented on DS-470: Making pruneIndexes() package private would presumably solve the

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-27 Thread Graham Triggs
2010/1/21 Tom De Mulder td...@cam.ac.uk On Wed, 20 Jan 2010, Richard Rodgers wrote: Apologies for the confusion - 'index_all' was the old name for the script: I did mean index-update. One wouldn't run index-init except in cases of new systems, corrupt indices or the like. Index-update

[Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-27 Thread Graham Triggs (JIRA)
[ http://jira.dspace.org/jira/browse/DS-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11108#action_11108 ] Graham Triggs commented on DS-470: It's a little while since my head was completely in the browse

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-27 Thread Simon Brown
On 27 Jan 2010, at 11:51, Graham Triggs wrote:
> I'm not going to advocate a specific solution here, but a philosophy. Speed and scalability are different things, and it's dangerous to conflate the two.
Happily, we weren't doing that. We were bringing to light a scalability issue by

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-27 Thread Mark Diggory
On Wed, Jan 27, 2010 at 10:16 AM, Simon Brown st...@cam.ac.uk wrote: I confess that the reason why we are having this discussion at all eludes me. It seems like a fairly obvious bug for the importer to prune the indexes so many times (the comment for pruneIndexes() even says called from the

[Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-26 Thread Graham Triggs (JIRA)
[ http://jira.dspace.org/jira/browse/DS-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11105#action_11105 ] Graham Triggs commented on DS-470: I'm looking at the possibility of having the indexer determine

[Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-26 Thread Mark Diggory (JIRA)
[ http://jira.dspace.org/jira/browse/DS-470?page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel&focusedCommentId=11106#action_11106 ] Mark Diggory commented on DS-470: I agree with Graham's assessment here. On a related tangent, there

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-20 Thread Thornton, Susan M. (LARC-B702)[RAYTHEON TECHNICAL SERVICES COMPANY]
Hi Richard, This caught my eye this morning because we have a large repository (currently 122,091 Items). We too have issues with our imports really slowing down as our repository grows in size and have looked for a solution to the problem. I just wanted to mention that the solution

Re: [Dspace-devel] [DSJ] Commented: (DS-470) Batch import times increase drastically as repository size increases; patch to mitigate the problem

2010-01-20 Thread Richard Rodgers
Hi Sue: Apologies for the confusion - 'index_all' was the old name for the script: I did mean index-update. One wouldn't run index-init except in cases of new systems, corrupt indices or the like. Index-update operates incrementally, and is *much* faster. Richard On Jan 20, 2010, at 1:05 PM,