Hi Simon & All,

On 10/5/2010 10:33 AM, Simon Brown wrote:
>
> On 4 Oct 2010, at 15:00, Graham Triggs wrote:
>
>> On 29 September 2010 14:17, Tom De Mulder<[email protected]>  wrote:
>> I know you like to talk down the problem, but that really isn't
>> helping.
>>
>> This isn't about talking down the problem - it's about finding where
>> the real problems are and not just patching the immediate concerns.
>> And considering the interests of nearly 1000 DSpace instances that
>> are registered on dspace.org - many of whom will probably be more
>> worried about rampant resource usage for small repositories from
>> adding overhead to cover up the problems of larger repositories.
>
> Which nobody has requested, making this a massive red herring. I fail
> to see how cutting back on unnecessary and redundant database access
> constitutes "overhead to cover up the problems of larger
> repositories". Any repository, regardless of size, will see
> improvements with this kind of optimisation, at least one example of
> which I have already highlighted (and had my arguments shouted down -
> this is also, incidentally, why I haven't bothered to open any other
> JIRA tickets on other performance issues we've seen. What would be the
> point?)

It's really unfortunate that you've experienced this and/or felt this 
way in the past.  Perhaps we haven't been able to tease out the problems 
at hand as well as we could have, and I hope we can improve upon that now.

However, I'd highly recommend freely adding specific issues to our JIRA 
-- it will *guarantee* that the DSpace committers will review & discuss 
them (each week, we set aside time in our weekly meeting to do so -- see 
https://wiki.duraspace.org/display/DSPACE/Developer+Meetings ).  When 
adding JIRA issues, specifics are best; that way we can narrow down 
where the problem may reside.

The longer these specific issues remain outside of JIRA, the more likely 
they will be accidentally overlooked in future versions of DSpace (as 
JIRA is our primary means of scheduling things to be fixed in new 
versions).  We really do mean well, and we'd like to work with you to 
resolve these issues.  We're not trying to continually throw up "red 
herrings" to avoid problems -- it's really a matter of attempting to 
better understand where the specific issue resides.

As volunteer developers, the DSpace Committers each have only a 
limited amount of time to work on DSpace in a given week. Therefore, the 
more information you can provide us with, the better. If you know of 
specific areas where there are redundant database accesses, we'd 
appreciate it if you could point them out to us (or enter a JIRA issue 
and we'll fix it).  We want to resolve these issues, but sometimes we 
don't have enough time in our normal work week to dig in deep enough to 
locate them.  We highly encourage sites that have stumbled across 
problems in the code to report them -- that way we can look at that 
specific area of the code and fix it so that it is no longer an issue.

> Leaving aside any theoretical ideal futures for the moment, it seems
> to me that the gist of this conversation is "DSpace does not support
> single-instance repositories over a certain size". That being the
> case, I think it would be only fair to make that lack of support
> explicit in the documentation and PR materials for the software, in
> order that all of the relevant information is readily available for
> anyone making decisions about the future of their repository.

I'd say we want to support single-instance repositories of larger sizes 
as well.  There will always be a size limit where it makes more sense to 
scale across multiple nodes, but we should be working to increase that 
size limit as much as we can (within reason, obviously).  Although it 
isn't yet explicit in our RoadMap, I think we also want to work towards 
allowing DSpace to scale across multiple nodes (where it makes sense to).

Again, the best way for us to improve your immediate DSpace performance 
is to better understand the exact problems you've already noticed.  We 
can only fix issues that we know about, and sometimes discovering where 
the issue resides can be the hardest part. If you've already discovered 
very specific issue(s), we'd appreciate it if you can share them.  If 
you haven't yet discovered the exact issue(s), we may be able to help 
narrow down the problem if you can share which parts of your DSpace seem 
'especially sluggish', etc.

The end result is that we really should be working together on a 
resolution for the present, rather than continually arguing over ideal 
futures or past discussions. Open source development works best if we 
can all share information/ideas/issues/resolutions freely and openly. 
Yes, that also means sometimes arguing openly -- which is perfectly OK 
by me, as sometimes arguments bring us all to a better solution or route 
forward. But I do want to encourage us all to keep things constructive, 
so that we can move DSpace software forward to the benefit of us all.

It's also worth mentioning that Graham is already volunteering some of his 
time to start digging in deeper to try and discover where some memory 
issues may already reside in DSpace 1.6, no matter what size a 
repository is.  Just today, he's started a separate, technical thread 
that may be of interest: 
http://www.mail-archive.com/[email protected]/msg12161.html 


Hopefully, as this investigation moves forward, we can all work together 
to find ways to improve DSpace performance both in the short term and 
longer term.

- Tim

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech