Re: [Dspace-tech] DSpace memory issue

2012-02-09 Thread Tom De Mulder
On Thu, 9 Feb 2012, Gabriel Dina wrote:

 We found in our DSpace installation (XMLUI) that JAVA uses a lot of memory
 for just a few items added in DSpace.

Even in the JSPUI there are memory leaks.

We have a nightly cronjob which restarts Tomcat to address the issue, even 
though we fixed several of the memory leaks.

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 09/02/2012 : The Moon is Waning Gibbous (93% of Full)

___
DSpace-tech mailing list
DSpace-tech@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-tech


Re: [Dspace-tech] Tools for automatic creation of dublin core and contents

2011-08-10 Thread Tom De Mulder
On Wed, 10 Aug 2011, Magnus Norberg wrote:

 does anyone know if there are any tools for automatic creation of dublin 
core files and contents files?

 One needs these files for batch import, one for each object. But if I 
have, say, a thousand files (for example PDF files) on my hard drive that I 
want to import into DSpace in a batch import, I do not want to create all 
these Item1, Item2 and so on directories one by one, and then create 
dublin core and content files one by one for each object; it would take 
too much time...

We created a tool that will do that work for you. All you need is the list 
of filenames and the metadata in a CSV file, such as can be created by any 
spreadsheet program (Excel or OpenOffice, for example). It'll then create 
the batch import structure for you. This might be one way to help with 
your problem.

http://tools.dspace.cam.ac.uk/metadatamapper/
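
For readers who want to see the general idea, here is a minimal sketch in Python. It is not the Cambridge tool linked above, only an illustration of the same approach. It assumes a CSV with a "filename" column (a made-up column name) and treats every other column header as "element" or "element.qualifier". It writes one directory per row containing a dublin_core.xml file and a contents file, which is the layout the DSpace batch importer reads.

```python
import csv
import shutil
from pathlib import Path
from xml.etree import ElementTree as ET


def build_archive(csv_path, source_dir, out_dir):
    """Create one item directory per CSV row; return the directories made."""
    source_dir, out_dir = Path(source_dir), Path(out_dir)
    created = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for number, row in enumerate(csv.DictReader(handle), start=1):
            item_dir = out_dir / f"item_{number:04d}"
            item_dir.mkdir(parents=True, exist_ok=False)

            # "filename" is an assumed column name identifying the bitstream.
            filename = row.pop("filename")
            shutil.copy2(source_dir / filename, item_dir / filename)
            (item_dir / "contents").write_text(filename + "\n", encoding="utf-8")

            # Every other column is read as "element" or "element.qualifier".
            root = ET.Element("dublin_core")
            for column, value in row.items():
                if not column or not value:
                    continue
                element, _, qualifier = column.partition(".")
                dcvalue = ET.SubElement(
                    root, "dcvalue", element=element, qualifier=qualifier or "none"
                )
                dcvalue.text = value
            ET.ElementTree(root).write(
                str(item_dir / "dublin_core.xml"),
                encoding="utf-8",
                xml_declaration=True,
            )
            created.append(item_dir)
    return created
```

A CSV with the header "filename,title,date.issued" would yield directories item_0001, item_0002 and so on, each holding the copied file, a one-line contents file and the metadata as dcvalue elements.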


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 10/08/2011 : The Moon is Waxing Gibbous (75% of Full)



Re: [Dspace-tech] Tools for automatic creation of dublin core and contents

2011-08-10 Thread Tom De Mulder
On Wed, 10 Aug 2011, Hugh Paterson III wrote:

 Tom, your extraction Method, does it take into account that the metadata 
values in the PDF (or other file) might not be correct? Does it allow for 
writing back to the file the correct values? It doesn't seem that it does 
write back to the files.

No, it doesn't. It just makes it easier to generate the DSpace batch 
importer format.

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 10/08/2011 : The Moon is Waxing Gibbous (77% of Full)



Re: [Dspace-tech] Dspace Installation on Ubuntu

2011-08-08 Thread Tom De Mulder

On Mon, 8 Aug 2011, bonface asiligwa wrote:


I have been trying to install DSpace on Ubuntu 11.04 but I don't 
succeed. Can someone just give a step-by-step installation of DSpace 7.1.2?


https://wiki.duraspace.org/display/DSPACE/Installing+DSpace+1.7+on+Ubuntu

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 08/08/2011 : The Moon is Waxing Gibbous (61% of Full)


Re: [Dspace-tech] SSL and HTTPS Question

2011-08-01 Thread Tom De Mulder
On Mon, 1 Aug 2011, Mark H. Wood wrote:

 Should the rest of their session take place over an https connection or is
 it safe for them to go back to regular http after they have logged in?

 In general we can't really answer that and you probably can't either.
 It depends on the nature of the stuff in your repository and your
 users' needs for privacy.  And if your repo. is public, you don't know
 who your users are until they've arrived.

If you go back to HTTP after signing in, then anyone can eavesdrop and 
steal your session.

If you do not want this, then you should make sure to run everything over 
HTTPS as soon as someone's logged in. Then the rest of their session 
should be encrypted.

Assuming that the rest of the repository is public, you probably don't 
want the overhead and lack of caching of running that over HTTPS, so it's 
better to run it over plain HTTP until people log in.
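
The policy described above (plain HTTP for anonymous browsing, HTTPS from the login page onwards) can be sketched as a small decision function. This is an illustration only, not DSpace or Tomcat code: the function name, the "/login" path and the example host are all assumptions.

```python
from urllib.parse import urlsplit, urlunsplit


def redirect_target(url, logged_in, login_paths=("/login",)):
    """Return the HTTPS URL to redirect to, or None if the request may proceed.

    Anonymous requests stay on plain HTTP; the login page itself and every
    request made by a logged-in user are pushed to HTTPS.
    """
    parts = urlsplit(url)
    needs_https = logged_in or parts.path in login_paths
    if needs_https and parts.scheme != "https":
        # Keep host, path, query and fragment; only the scheme changes.
        # A URL with an explicit port would need the port rewritten as well.
        return urlunsplit(("https",) + tuple(parts[1:]))
    return None
```

For the redirect to help, the session cookie would also have to carry the Secure attribute, so that the browser never sends it over a plain HTTP connection.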


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 01/08/2011 : The Moon is Waxing Crescent (9% of Full)



[Dspace-tech] Hiding dark items

2011-07-29 Thread Tom De Mulder

We got quite a few queries recently about how we hide dark items from the 
browse and OAI-PMH views. We've picked our code apart and put the changes 
online:
http://tools.dspace.cam.ac.uk/dark_items.html

We hope this will be useful for other people in the DSpace community.


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 29/07/2011 : The Moon is Waning Crescent (14% of Full)



[Dspace-tech] Custom thumbnail code

2011-06-17 Thread Tom De Mulder

DSpace@Cambridge uses custom thumbnail code for a variety of reasons:

* Separating these UI components from actual archival content
* Reducing database load
* Having thumbnails generated on the fly rather than waiting for the
   media filter
* Having higher-quality thumbnails than those produced by the default
   DSpace thumbnail system

Because we got several enquiries into how we did this, we made the code 
and an explanation thereof available online:
  http://tools.dspace.cam.ac.uk/thumbnails/

We hope this will turn out to be useful for people with problems similar 
to those we used to have.


--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 17/06/2011 : The Moon is Waning Gibbous (92% of Full)



Re: [Dspace-tech] Embargo and OAI interface

2011-05-10 Thread Tom De Mulder
On Fri, 6 May 2011, Richard Rodgers wrote:

 The embargo system is designed to protect bitstreams, not metadata. 
While it certainly would be possible to alter OAI or other code to check 
for embargo dates, this has not been done to the best of my knowledge. I 
am curious why, given that the content will be inaccessible, is it 
desirable to hide the metadata from harvesters?

I'd like to ask for a flag in the dspace config file to let dark items be 
properly dark (including embargoed items). This applies to search results 
as well as (possibly even more so) to harvesting.

There are several instances where it might be necessary for metadata to be 
hidden:

- data protection (if the metadata contains sensitive information)
- commercial interest (e.g. novel discoveries waiting to be exploited)
- academic (e.g. disputed works)
- usability (dark items aren't available, so shouldn't show up)

We've put considerable work into filtering dark items from search results 
(which, for all that effort, is still a dirty hack) and from OAI. It 
would be nice to see this functionality in the main code base.


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 10/05/2011 : The Moon is Waxing Crescent (44% of Full)



Re: [Dspace-tech] Embargo and OAI interface

2011-05-10 Thread Tom De Mulder
On Tue, 10 May 2011, Blanco, Jose wrote:

 I have been working lately on hiding items from search results that have 
READ metadata restrictions for certain users.  So for example, item1 is 
restricted to only one particular user, if that user is logged-in and 
searches for a string in that item, he will get the item in the results 
set, but if an anonymous user is logged in and searches for a string in 
that item, the item will not show in the search results.  I am now trying 
to restrict items like this in the browsing, but am having more 
difficulty.  It sounds like you may have something that restricts items 
from showing up when browsing.  Is that the case?  Could you share the 
code that does that?

We do have code that does that, but it's quite an ugly hack -- it filters 
results from the browse pages (including search results) by checking 
authorization as the browse list is created. This does mess up pagination.

Sadly, our developer is indisposed at the moment, and I wouldn't know 
where to find all the changes, so sharing it isn't really possible at the 
moment. Sorry.

However, I do gather that to Do It Properly, changes would be needed to 
the actual browse system.
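
To illustrate why checking authorization while the list is being built disturbs pagination, here is a small sketch in plain Python. It is not DSpace code; the function names and the can_read callback are made up for the example. Filtering a page after it has been cut yields short pages, whereas filtering before the cut (which is what a change to the browse system itself would amount to) keeps pages full.

```python
def page_then_filter(items, can_read, page, size):
    """Cut the page first, then drop unreadable items (the 'hack')."""
    window = items[page * size:(page + 1) * size]
    return [item for item in window if can_read(item)]


def filter_then_page(items, can_read, page, size):
    """Drop unreadable items first, then cut the page."""
    visible = [item for item in items if can_read(item)]
    return visible[page * size:(page + 1) * size]


items = list(range(1, 11))            # ten items; even-numbered ones are dark


def can_read(item):
    return item % 2 == 1


print(page_then_filter(items, can_read, page=0, size=4))   # [1, 3]
print(filter_then_page(items, can_read, page=0, size=4))   # [1, 3, 5, 7]
```

With the first approach the user sees a page of two results although four were requested, and the total count no longer matches what is shown.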


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 10/05/2011 : The Moon is Waxing Crescent (46% of Full)



Re: [Dspace-tech] Scalability issues report, dsp...@cambridge

2010-10-08 Thread Tom De Mulder
On 7 Oct 2010, at 21:56, Stuart Lewis wrote:

 with 16GB of memory and fast local storage
 Java memory: -Xmx2048M -Xms2048M 
 Is there a reason why you only allocate 1/8th of the system memory to the 
 application?  Have you found that adding extra doesn't help?

In our experience, it merely delays when the error occurs, and we'd still need 
to restart. Whether we do this nightly or every other night doesn't make much 
difference. I'm not sure it would actually make it go faster. Additionally, we 
need to keep memory free for file caching and thumbnail generation; we found 
that if we assign too much memory to Java then the system needs to read from 
disk more for these other tasks and we get a slow-down there. 

 - Assetstore: random structure causes large overhead on filesystem for no 
 real gain
 Are you able to expand on the overhead that is caused, and from your 
 profiling, explain how the structure could be improved?  My gut (and 
 uninformed) instinct would be that since asset store reads are completely 
 random depending on the items being viewed at the time, the layout of 
 directories would be irrelevant.  Writes may be slightly less efficient, but 
 since writes only tend to occur once, they are of less consequence.  

Apologies for sounding cryptic; I was trying not to be too verbose in the 
template. :-) 

This has mostly to do with back-ups. With about 600,000 files in random 
directories, it can be hard to find out what files have changed. We implemented 
a simple asset store structure that stores files by year/month/day. This means 
we can mirror new files very quickly, and only traverse the entire assetstore 
every other day to check if files have changed.
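
As a rough sketch of the idea (not our actual patch; the function names and the "/assetstore" root are assumptions for the example), a date-based layout lets a mirroring job work out which directories can contain new files without walking the whole store:

```python
from datetime import date, timedelta
from pathlib import Path


def dated_path(root, stored_on, name):
    """Where a file stored on a given day would live: root/YYYY/MM/DD/name."""
    return Path(root) / f"{stored_on:%Y/%m/%d}" / name


def recent_dirs(root, today, days=1):
    """Directories that can hold files stored during the last `days` days."""
    result = []
    for offset in range(days):
        day = today - timedelta(days=offset)
        result.append(Path(root) / f"{day:%Y/%m/%d}")
    return result
```

An hourly job would then only need to copy the directory returned by recent_dirs(root, date.today()), leaving the full traversal for the less frequent complete run.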

Maybe I should expand a bit on our storage set-up:

- our live system has about 90TB capacity, with an EMC SAN connected to a pair 
of Sun servers. These present them to our private network at about 4Gbps, as 
well as running the checksums (I wrote some Perl to do this job locally, rather 
than add to the I/O of the live server.)

- we have two sets of back-up servers (ZFS-based) off-site for the live system, 
which use rsync to mirror all this data. (Two systems because otherwise, if we 
lose one, it'd be vulnerable too long while the data is re-sync'ed). 

A small script makes copies of the day's assetstore every hour; a complete 
rsync runs across assetstores (the original one as well as the new one with our 
own datestamp format) every alternating day, and at week-ends we run rsync with 
checksums. Essentially this system is copy-on-write: if a file changes on disk, 
the old back-up copy is moved into a holding area to be deleted when necessary, 
and the new file copied in its place.
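
The copy-on-write behaviour can be sketched in a few lines of Python. This is an illustration rather than our set-up, which relies on rsync; the function names are made up, and SHA-256 is used here only as an example checksum.

```python
import hashlib
import shutil
from pathlib import Path


def sha256(path):
    """Checksum a file in 1 MiB chunks so large bitstreams fit in memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()


def mirror_file(source, backup, holding):
    """Mirror one file; return 'new', 'unchanged' or 'replaced'."""
    source, backup, holding = Path(source), Path(backup), Path(holding)
    if backup.exists():
        if sha256(source) == sha256(backup):
            return "unchanged"
        # The file changed: keep the old copy in the holding area.
        # A real holding area would need unique names to avoid clashes.
        holding.mkdir(parents=True, exist_ok=True)
        shutil.move(str(backup), str(holding / backup.name))
        status = "replaced"
    else:
        status = "new"
    backup.parent.mkdir(parents=True, exist_ok=True)
    shutil.copy2(source, backup)
    return status
```

Comparing checksums on every run is expensive, which is presumably why a cheaper comparison suits the frequent runs and the checksum pass is kept for week-ends.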

Finally, the date structure for the directory/file names helps locate problem 
files quickly if necessary. Not a huge thing, but it makes my life easier.

 - Search indexer: fails on large repositories, slowing down and eventually 
 running out of memory.
 Do you have any percentages on the amount of page views that relate to 
 browse, and how many relate to other views?  I'm curious if browse from the 
 front end is causing an issue too?  The reason I'm asking, is that with the 
 potential inclusion of the dspace-discovery layer in a future version, this 
 could replace the database-driven browse system with solr.  Not only will 
 this provide a richer faceted search, but it could likely offer a good 
 performance boost for browse-related functions.  It also offers another way 
 of scaling-out, by putting solr on a different server.

This question I'll have to leave to Simon to answer, so I don't make a hash of 
it. 


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] Scalability issues report, dsp...@cambridge

2010-10-08 Thread Tom De Mulder
Dear all,

I'm attaching a dump of our PostgreSQL configuration to this email. We got some 
input from Postgres developers into how best to tune for our needs, but if 
someone has suggestions for things to try then we'd be happy to hear them.


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH

 name                            | setting       | description
---------------------------------+---------------+--------------------------------
 add_missing_from                | off           | Automatically adds missing table references to FROM clauses.
 allow_system_table_mods         | off           | Allows modifications of the structure of system tables.
 archive_command                 | (disabled)    | Sets the shell command that will be called to archive a WAL file.
 archive_mode                    | off           | Allows archiving of WAL files using archive_command.
 archive_timeout                 | 0             | Forces a switch to the next xlog file if a new file has not been started within N seconds.
 array_nulls                     | on            | Enable input of NULL elements in arrays.
 authentication_timeout          | 1min          | Sets the maximum allowed time to complete client authentication.
 autovacuum                      | on            | Starts the autovacuum subprocess.
 autovacuum_analyze_scale_factor | 0.1           | Number of tuple inserts, updates or deletes prior to analyze as a fraction of reltuples.
 autovacuum_analyze_threshold    | 50            | Minimum number of tuple inserts, updates or deletes prior to analyze.
 autovacuum_freeze_max_age       | 2             | Age at which to autovacuum a table to prevent transaction ID wraparound.
 autovacuum_max_workers          | 3             | Sets the maximum number of simultaneously running autovacuum worker processes.
 autovacuum_naptime              | 1min          | Time to sleep between autovacuum runs.
 autovacuum_vacuum_cost_delay    | 20ms          | Vacuum cost delay in milliseconds, for autovacuum.
 autovacuum_vacuum_cost_limit    | -1            | Vacuum cost amount available before napping, for autovacuum.
 autovacuum_vacuum_scale_factor  | 0.2           | Number of tuple updates or deletes prior to vacuum as a fraction of reltuples.
 autovacuum_vacuum_threshold     | 50            | Minimum number of tuple updates or deletes prior to vacuum.
 backslash_quote                 | safe_encoding | Sets whether \' is allowed in string literals.
 bgwriter_delay                  | 200ms         | Background writer sleep time between rounds.
 bgwriter_lru_maxpages           | 100           | Background writer maximum number of LRU pages to flush per round.
 bgwriter_lru_multiplier         | 2             | Multiple of the average buffer usage to free per round.
 block_size                      | 8192          | Shows the size of a disk block.
 bonjour_name                    |               | Sets the Bonjour broadcast service name.
 check_function_bodies           | on            | Check function bodies during CREATE FUNCTION.
 checkpoint_completion_target    | 0.5           | Time spent flushing dirty buffers during checkpoint, as fraction of checkpoint interval.
 checkpoint_segments             | 12            | Sets the maximum distance in log segments between automatic WAL checkpoints.
 checkpoint_timeout              | 5min          | Sets the maximum time between automatic WAL checkpoints.
 checkpoint_warning              | 30s           | Enables warnings if checkpoint segments are filled more frequently than this.
 client_encoding                 | UTF8          | Sets the client's character set encoding.
 client_min_messages             | warning       | Sets the message levels that are sent to the client

[Dspace-tech] Scalability issues report, dsp...@cambridge

2010-10-07 Thread Tom De Mulder
DSpace scalability issues report, per wiki template:

1. dsp...@cambridge, The University of Cambridge, UK.
   Technical contacts: Tom De Mulder, td...@cam.ac.uk (systems manager)
                       Simon Brown st...@cam.ac.uk (DSpace developer)

2. a. DSpace version 1.6.2 with extensive local patches, using JSPUI
      Size: 137 communities, 258 collections, 200k items, 12TB, 436k
      bitstreams (excluding licenses)

   b. PostgreSQL 8.4.4

   c. Tomcat 6.0.24 standalone

   d. Separate servers for webapp, DB, storage and ancillary functions
      Webapp/DB servers are HT 8-core Intel servers running Ubuntu Linux
      with 16GB of memory and fast local storage
      Java memory: -Xmx2048M -Xms2048M

3. a. - Unless Tomcat is restarted, it will consistently fail due to lack
        of memory in less than 48 hours.
      - Batch importer: will fail on large batch imports (order of
        thousands of items), performance degrades with size of repository
        and of batch.
      - Search indexer: fails on large repositories, slowing down and
        eventually running out of memory.
      - Assetstore: random structure causes large overhead on filesystem
        for no real gain

      See also our poster, presented in Gothenburg:
      http://tools.dspace.cam.ac.uk/DSUG09%20A2%20poster.pdf

   b. Installed vanilla DSpace 1.6.2, imported 200k randomly generated
      items, ran siege against it, watched it not cope.
      We've done profiling in the past, but not for 1.6. However, we've
      not noticed significant changes in the code that has issues.

   c. We have patches for the indexer; batch importer; thumbnail and PDF
      text extraction; assetstore structure; dark item masking in OAI and
      browse code

4. We can't commit to volunteering unless this can be incorporated into the
   work we need to undertake in our primary capacity of running the
   University's Institutional Repository. However, we would be willing to
   try and make this happen.


--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] Scalability issues report, dsp...@cambridge

2010-10-07 Thread Tom De Mulder
(Apologies for replying to my own email.)

One metric the template didn't ask for, I just noticed, is the number of hits 
per second. 

We average about 2 hits per second, which is very low, even if most of these 
hits are actual page views, not just layout elements. However, both our webapp 
and database servers are under constant load, the latter in particular.

Actual load average numbers are meaningless for comparison because they depend 
so much on the way the OS kernel implements them, so I won't give them. Suffice 
to say, though, that we had to ask the people running our university search 
engine and similar services to throttle their index rate so the servers 
wouldn't get overloaded.

Also of note is that the problems are mostly on the database and webapp end; 
there are no problems with I/O (disk or network).


--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-10-06 Thread Tom De Mulder
On 6 Oct 2010, at 15:15, Graham Triggs wrote:

[snip]

This is exactly the kind of pointless pontification that we got last time.

Any point that is raised is deflected or ignored, and you even manage to 
contradict yourself between paragraphs. What's it to be, should patches benefit 
ALL repositories, or is it fine if it's just some? Or the other way round, 
maybe?


I will be very happy to offer our experiences regarding large-scale DSpace 
instances with the community, if that can be of any help. But not if it 
involves having to deal with Graham Triggs.


I really do not have time for this.


--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-29 Thread Tom De Mulder
On 24 Sep 2010, at 21:17, bill.ander...@library.gatech.edu wrote:

 We've been experiencing problems similar to some reported on this thread 
 since our
 upgrade to 1.6 several months ago.  We're still using the jspui, and we've 
 wondered 
 (among other things) if some of these problems might be alleviated by a 
 switch to
 the xmlui.  Has anybody had any experience comparing the memory footprint 
 and/or resource
 usage issues between the two interfaces?

We load-tested the XMLUI (on identical hardware) and it was even worse. It ran 
out of memory and crashed really quickly, so we never took it into production. 
But your mileage may vary.


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-29 Thread Tom De Mulder
On 29 Sep 2010, at 11:38, Hilton Gibson wrote:

 We started with a VM which had 2GB memory.
 Then added 2GB to the VM, no luck.
 Then luckily we had funds to buy a server.
 So now we have 12GB RAM and 12CPU's. No crashes so far.
 Using the XMLUI.
 Does DSpace really need this and what happens when we go to one million items 
 ??

A lot of the back-end code of DSpace, the very core of it, is inherently 
inefficient. Several tasks are executed more than once, and entire objects are 
created when only one attribute is needed, etc. (I'd be more specific, but I'm 
not a specialist on this matter, and our resident DSpace developer is on leave 
this week.)


I am really glad to hear from other people with problems similar to ours.


--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-29 Thread Tom De Mulder
On 29 Sep 2010, at 11:47, Mark Ehle wrote:

 Why was tomcat chosen as a platform for DSpace?

It wasn't. You can use any Servlet engine. We used JBoss for a while but went 
back to Tomcat because it fitted into our infrastructure better.

I believe DSpace was written in Java because Rob Tansley wanted to try writing 
a project in Java, but I could be wrong. :)


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-29 Thread Tom De Mulder
On 29 Sep 2010, at 13:03, Graham Triggs wrote:
 
 Some of those repositories have 1000s of items, and get quite decent levels
 of access.
 

Thousands?

I don't even want to have this discussion until you're talking hundreds of 
thousands, and how many hits per second. I know you like to talk down the 
problem, but that really isn't helping.

We run 5 DSpace instances; three of these are systems with hundreds of 
thousands of items, and they are dog slow and immensely resource-intensive. And 
yes, we want these to be single systems. Why shouldn't we?

We have other systems here at the University that are much bigger, do similar 
things and require far, far less in terms of resources.

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-23 Thread Tom De Mulder
On 22 Sep 2010, at 20:22, Sands Alden Fish wrote:

 (2) We currently don't have a centralized server with enough test data
 to run many of these memory or scalability tests on our own.  I think
 this is something we could look into improving upon (especially if
 anyone has test data to donate to the cause).

There is a lot of public domain data available online. I spent some time 
collecting some of this in a variety of formats (text, images, movies, sound, 
datasets) and then wrote something to use a word list (e.g. /usr/share/dict on 
most Linux systems) to create random metadata for them. 

After all, it doesn't matter that many bitstreams will be identical.

That is how we populated our test environment here so we could replicate the 
problems we were seeing on the live system.
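
A minimal sketch of that kind of generator follows. It is not the script we used; the field names are made up, and /usr/share/dict/words is assumed to be the word list file (it is on many Linux systems, but not all).

```python
import random


def load_words(path="/usr/share/dict/words"):
    """Read a word list, keeping only purely alphabetic entries."""
    with open(path, encoding="utf-8") as handle:
        return [line.strip() for line in handle if line.strip().isalpha()]


def random_metadata(words, rng, title_words=5, subject_count=3):
    """Build one fake metadata record from randomly chosen words."""
    title = " ".join(rng.choice(words) for _ in range(title_words))
    surname = rng.choice(words).capitalize()
    forename = rng.choice(words).capitalize()
    return {
        "title": title.capitalize(),
        "creator": f"{surname}, {forename}",
        "subject": [rng.choice(words) for _ in range(subject_count)],
    }
```

Passing a seeded random.Random instance as rng makes the generated test set reproducible, which helps when comparing runs before and after a patch.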


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-22 Thread Tom De Mulder

I am very happy to see that this issue seems finally to be taken seriously. 
However, I find myself getting a bit frustrated that it was never taken 
seriously when I raised it in the past.

I think the DSpace source code carries with it a lot of historical baggage, and 
it could do with being addressed even without making fundamental changes to the 
basic architecture. My personal favourite would be a completely new 
architecture with more loosely coupled modules, but fixing memory leaks and the 
associated slow performance would be a good start.

I can add that, for example, deleting a collection with 1200 items on our 
rather powerful DSpace machines will take two hours, and uses most of the 
available memory. You can see why I would like that no longer to be the case.


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH




Re: [Dspace-tech] tomcat reporting memory leak?

2010-09-20 Thread Tom De Mulder
On Mon, 20 Sep 2010, Damian Marinaccio wrote:

 I'm seeing the following log messages in catalina.out:
 [...]
 SEVERE: The web application [] appears to have started a thread named 
 [FinalizableReferenceQueue] but has failed to stop it.
 This is very likely to create a memory leak.

There are quite a few memory leaks in DSpace. We have a cronjob to restart 
Tomcat nightly, because otherwise it'll break the next day.


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 20/09/2010 : The Moon is Waxing Gibbous (80% of Full)



Re: [Dspace-tech] Meatadata dates stored as UTC

2010-06-25 Thread Tom De Mulder
On Fri, 25 Jun 2010, TAYLOR Robin wrote:

 Dates held in the metadatavalues table are converted from their local
 time zone to UTC before being stored in the database. The problem is that
 they are not generally converted back to their local time zone before
 being displayed (see Jira http://jira.dspace.org/jira/browse/DS-568).
 This is misleading to the user. You could conceivably see that you had
 submitted an item whilst you were still asleep in bed. I'm not sure what
 to do about this. It would be messy to always check for a metadatavalue
 being a date before displaying it. What would be the consequences of not
 storing dates as UTC? Could we store them with a time zone, e.g. 22:30+04?
 This might be a little less confusing. I'm sure there are good reasons
 for storing dates as UTC, I just don't know what they are. Can anyone help?

They're stored in Zulu time (UTC), which has the advantage of not being 
dependent on time zones or daylight saving time.

The best thing to do is to store them in this timezone, but to convert 
them on display to the local time.
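
A minimal Python sketch of "store in UTC, convert on display". It assumes the stored value looks like 2010-06-25T18:30:00Z, and it uses a fixed +04:00 offset (borrowed from the example in the question) in place of a real user time zone; the function names are made up for the illustration.

```python
from datetime import datetime, timedelta, timezone


def parse_utc(stamp):
    """Parse a 'YYYY-MM-DDTHH:MM:SSZ' string into an aware UTC datetime."""
    return datetime.strptime(stamp, "%Y-%m-%dT%H:%M:%SZ").replace(tzinfo=timezone.utc)


def for_display(stamp, local_zone):
    """Convert a stored UTC timestamp to the viewer's zone, for display only."""
    return parse_utc(stamp).astimezone(local_zone).isoformat()


# A fixed +04:00 offset stands in for the user's real time zone.
plus_four = timezone(timedelta(hours=4))
print(for_display("2010-06-25T18:30:00Z", plus_four))  # 2010-06-25T22:30:00+04:00
```

The stored value never changes; only the rendering depends on who is looking at it.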


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 25/06/2010 : The Moon is Waxing Gibbous (91% of Full)



Re: [Dspace-tech] Bad robot! Googlebot and Internal Server Errors

2010-02-11 Thread Tom De Mulder
On Thu, 11 Feb 2010, Michael White wrote:

:session_id=9E40BFD899A2AA5C23E81404AF5B97A5:internal_error:-- URL Was: 
https://dspace.stir.ac.uk/dspace/browse-title?bottom=1893/214
[snip]
 
 User-agent: *

 Disallow: /browse-author
 Disallow: /items-by-author
 Disallow: /browse-date
 Disallow: /browse-subject
 

You should add /dspace to the start of those disallowed patterns, 
because your DSpace URLs start with /dspace after the hostname.

The standard (or rather, consensus) has this to say about Disallow 
fields in robots.txt:

 "The value of this field specifies a partial URL that is not to be 
 visited. This can be a full path, or a partial path; any URL that starts 
 with this value will not be retrieved."

Note the "starts with".

See also: http://www.robotstxt.org/
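
To make the prefix rule concrete, here is a small Python sketch that applies only the "starts with" test quoted above. It is not a full robots.txt parser, and the /dspace/browse-author URL is a hypothetical example built from the hostname in the log line.

```python
from urllib.parse import urlsplit


def is_disallowed(url, disallow_rules):
    """Return True if the URL path starts with any Disallow value."""
    path = urlsplit(url).path
    return any(path.startswith(rule) for rule in disallow_rules if rule)


# Hypothetical URL on a site whose DSpace lives under /dspace.
url = "https://dspace.stir.ac.uk/dspace/browse-author"
print(is_disallowed(url, ["/browse-author"]))         # False: the path starts with /dspace
print(is_disallowed(url, ["/dspace/browse-author"]))  # True
```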


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 11/02/2010 : The Moon is Waning Crescent (19% of Full)



Re: [Dspace-tech] Dspace and https

2010-02-09 Thread Tom De Mulder
On Tue, 9 Feb 2010, Fabien COMBERNOUS wrote:

 I installed a Dspace from trunk checkout. All is well running. Now i
 want to setup an https access to my dspace repository.

 With tomcat6 it looks necessary to use SSLEnabled=true in the
 connector about port 8443. Now i have the following error about ssl config :
 09-Feb-2010 11:00:03 org.apache.coyote.http11.Http11Protocol start
 SEVERE: Error starting endpoint
 java.io.IOException: jsse.invalid_ssl_conf
 ...
 Caused by: javax.net.ssl.SSLException: No available certificate or key
 corresponds to the SSL cipher suites which are enabled.

Have you tried following the SSL Howto? It may address your problem:
http://tomcat.apache.org/tomcat-6.0-doc/ssl-howto.html


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 09/02/2010 : The Moon is Waning Crescent (33% of Full)



[Dspace-tech] DSpace batch ingest scalability and performance

2010-01-19 Thread Tom De Mulder
Hello all.

We (the DSpace team at the University of Cambridge) are currently holding 
our own mini-1.6-testathon. Our particular interest lies with scalability, 
because it has caused us trouble in the past.

If this interests anyone, I'm trying to write up our tests, notes, 
conclusions etc on a blog I set up for this purpose. The first figures, 
for importing about 100,000 items in batches, can be seen here:
http://tdm27.wordpress.com/2010/01/19/dspace-1-6-scalability-testing/

I figured that this would be better than trying to write it up here on the 
mailing list.


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 19/01/2010 : The Moon is Waxing Crescent (24% of Full)



[Dspace-tech] Zotero

2009-12-09 Thread Tom De Mulder
Hi all,

as some/many people will know, Zotero used to work with DSpace 1.4, but no 
longer works with 1.5 and higher. This isn't DSpace's fault -- Zotero is 
merely being too eager to invoke the wrong translator when it recognises a 
DSpace site.

If other people think this is important (and Zotero is certainly seeing 
more and more use), could they please add their comments to the forum 
thread on the Zotero forum, to push their developers to make Zotero work 
again with current versions of DSpace?

http://forums.zotero.org/discussion/7009/dspace-translators-not-longer-valid/


Thanks,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
- 09/12/2009 : The Moon is Waning Gibbous (52% of Full)



[Dspace-tech] Batch importer spreadsheet metadata mapping tool

2009-09-18 Thread Tom De Mulder
To whom it may be of interest:

we recently had cause to develop a tool for internal use, to generate a 
DSpace (1.5.x) batch importer structure from a spreadsheet and associated 
files. This has helped facilitate batch deposit by people who otherwise 
would have lacked the technical prowess to generate the correct importer 
structure. Now, instead, they can produce a spreadsheet that describes the 
items they want to deposit.

Given how popular this tool has turned out to be, we decided to share it 
with the DSpace community, in case it might prove useful:

http://tools.dspace.cam.ac.uk/metadatamapper/
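
For readers who want a feel for what such a mapping involves, here is a rough Python sketch. It is not the tool linked above. It assumes a CSV file with a "filename" column plus metadata columns named like "title" or "contributor.author" (element, optionally followed by a dot and a qualifier), and it writes one directory per row containing dublin_core.xml, a contents file, and a copy of the bitstream. The function name and the item_NNN directory naming are also assumptions.

```python
import csv
import shutil
import xml.etree.ElementTree as ET
from pathlib import Path


def build_import_structure(csv_path, files_dir, out_dir):
    """Create one item_NNN directory per CSV row; return the directories made."""
    created = []
    with open(csv_path, newline="", encoding="utf-8") as handle:
        for index, row in enumerate(csv.DictReader(handle), start=1):
            item_dir = Path(out_dir) / f"item_{index:03d}"
            item_dir.mkdir(parents=True, exist_ok=True)

            # dublin_core.xml: one <dcvalue> per non-empty metadata column.
            root = ET.Element("dublin_core")
            for column, value in row.items():
                if not column or column == "filename" or not value:
                    continue
                element, _, qualifier = column.partition(".")
                dcvalue = ET.SubElement(
                    root, "dcvalue", element=element, qualifier=qualifier or "none"
                )
                dcvalue.text = value
            ET.ElementTree(root).write(
                str(item_dir / "dublin_core.xml"), encoding="utf-8", xml_declaration=True
            )

            # contents: the bitstream filename, next to a copy of the file itself.
            filename = row["filename"]
            shutil.copy(Path(files_dir) / filename, item_dir / filename)
            (item_dir / "contents").write_text(filename + "\n", encoding="utf-8")
            created.append(item_dir)
    return created
```

A spreadsheet exported as CSV with the header row filename,title,contributor.author would produce item_001, item_002 and so on under out_dir, ready to be handed to the batch importer.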


Best regards,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 18/09/2009 : The Moon is Waning Crescent (7% of Full)



Re: [Dspace-tech] Why the DSpace checksum checker?

2009-04-17 Thread Tom De Mulder
On Fri, 17 Apr 2009, Mark Diggory wrote:

 I've never been impressed with the reasoning behind this addition to
 DSpace, it mistakes bitstream security and file corruption as
 something that should be tracked by the DSpace application. We

I agree, but with one caveat:

 A real file integrity system should be implemented outside of the
 application by an experienced system administrator vested in
 maintaining the security and integrity of the system, not in the
 application by a webapplication developer.  I do value and respect the

It is important to make sure that the file the web application put on disk 
is the same as the one still there. While various monitoring tools can 
check if files have changed on disk, at some stage there should be a 
verification that the checksum the archive has recorded for a file matches 
what's on disk.

However, this is easily done outside the webapp.

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 17/04/2009 : The Moon is Waning Gibbous (54% of Full)



Re: [Dspace-tech] Performance issues with bitstream checker

2009-04-16 Thread Tom De Mulder
On Tue, 14 Apr 2009, Ruijgrok, P.T. (Peter) wrote:

 I had serious performance problems with the bitstream checker, running
 Dspace 1.4.x
 We have +320.000 bitstreams and increasing continously.

Sadly, the current DSpace codebase has some serious scalability issues. 
(And Java's MD5 implementation isn't the fastest, either, but that's not 
the main culprit.)

For our instance, which has a separate server hosting the filesystem 
(which itself resides on a SAN), I wrote a Perl script to do the 
checksumming. It runs continuously in a loop, and manages nearly 500,000
bitstreams in 6 to 10 hours, depending on the load on the fileserver. It 
uses the md5sum binary from solarisfreeware.com.

It puts almost no load on the database, because it only queries the 
checksums from the bitstream table once, at the start. Output is logged 
continuously, and our local Nagios server monitors for any checksum 
errors.

This also has the advantage that it doesn't load the (Tomcat) webapp box, 
which already has enough work to do.

It also means that the same script can run on our backup servers (which 
also use disk; we couldn't manage with tape).
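
The general shape of an out-of-webapp checker can be sketched in a few lines of Python. This is not the Perl script described above, only an illustration of the same idea: the expected checksums are fetched once (here they are simply passed in as a dict, since the query is not shown in this message), and every file is then hashed from disk. The function names and the "missing"/"mismatch" labels are assumptions.

```python
import hashlib
from pathlib import Path


def md5_of_file(path, chunk_size=1024 * 1024):
    """Hash a file in chunks so large bitstreams are never held in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_checksum_errors(expected, root):
    """Compare recorded MD5 values (relative path -> hex digest) with the disk.

    Returns a list of (relative_path, problem) tuples; an empty list means
    every file is present and matches its recorded checksum.
    """
    errors = []
    for relative_path, recorded in expected.items():
        target = Path(root) / relative_path
        if not target.is_file():
            errors.append((relative_path, "missing"))
        elif md5_of_file(target) != recorded.lower():
            errors.append((relative_path, "mismatch"))
    return errors
```

Logging each entry of the returned list gives a monitoring system something simple to watch for, and the same function can be pointed at a backup copy of the files by changing root.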

We've taken this approach with other things as well, such as our 
thumbnails (which aren't using the DSpace code, because we wanted to 
separate something as user-interface-centric as that from the actual 
archive contents; the DSpace code was also just too slow and just crashed 
our server).


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 16/04/2009 : The Moon is Waning Gibbous (59% of Full)



Re: [Dspace-tech] Backup procedure

2009-02-20 Thread Tom De Mulder
On Fri, 20 Feb 2009, West, Jeff wrote:

 I would also be interested in an answer to this question.  We currently 
run Fedora 10 Linux.  We haven't populated anything, because we want 
clear way to backup and restore in the event of a server crash.

Are you running PostgreSQL?

In which case the pg_dump command is all you need. Run it on a regular 
schedule with a user with sufficient access privileges, e.g.

  pg_dump --format t yourdspacedbname > databasedump.2009-02-20.sql

That gives you a DB dump you can just slurp back in in future and will 
be enough in almost all cases.
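
For a scheduled job, the dated filename can be generated rather than typed. The following Python wrapper is a sketch under a few assumptions: pg_dump is on the PATH, the account running it can connect to the database, and the helper names are invented for the example. It uses pg_dump's --file option instead of shell redirection, and names the output .tar because --format t produces a tar archive.

```python
import datetime
import subprocess


def dump_command(db_name, day):
    """Return the pg_dump argument list and the dated output filename."""
    out_file = f"databasedump.{day.isoformat()}.tar"
    return ["pg_dump", "--format", "t", "--file", out_file, db_name], out_file


def run_dump(db_name):
    """Run pg_dump for today's date; raises CalledProcessError if it fails."""
    command, out_file = dump_command(db_name, datetime.date.today())
    subprocess.run(command, check=True)
    return out_file
```

Calling run_dump("yourdspacedbname") from a nightly job leaves one archive per day. A tar-format dump is read back with pg_restore rather than psql.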


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 20/02/2009 : The Moon is Waning Crescent (38% of Full)



Re: [Dspace-tech] Speed problem in postgres during batch ingesting

2009-01-27 Thread Tom De Mulder
On Tue, 27 Jan 2009, Stuart Lewis wrote:

 The following paper talks about this, and how DSpace performs when ingesting
 1 million items:

 Testing the Scalability of a DSpace-based Archive, Dharitri Misra, James
 Seamans, George R. Thoma, National Library of Medicine, Bethesda, Maryland,
 USA

 http://www.dspace.org/images/stories/ist2008_paper_submitted1.pdf

 Is this one big import of 30,000 items, or do you break them up into smaller
 chunks?

That paper doesn't use the DSpace importer, so I fail to see how it can 
claim the importer scales well.

I can tell from a lot of first-hand experience that the DSpace importer 
doesn't scale, and that it gets slower as you have more items in your 
DSpace instance, as well as slowing down for each item in the batch.

In addition, if you have a busy DSpace instance, there may be issues with 
file locking where deleted filehandles don't get recovered properly.


best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 27/01/2009 : The Moon is Waning Crescent (3% of Full)



Re: [Dspace-tech] Google bots and web crawlers

2009-01-14 Thread Tom De Mulder
On Wed, 14 Jan 2009, Shane Beers wrote:

 We had an issue with our local google instance crawling our DSpace 
 installation and causing huge issues. I re-wrote the robots.txt to disallow 
 anything besides the item pages themselves - no browsing pages or search 
 pages 
 and whatnot. Here is a copy of ours:

We've had to do that for years; without it DSpace just crumbles under the 
load. I've got a small Perl script which generates a flat html file with 
links to all our item pages, and we put a link to that in the footer.
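
Such a link page is easy to produce. The Python sketch below is not the Perl script mentioned above; it assumes item pages live at <base URL>/handle/<handle> and that the list of handles has already been obtained from somewhere (that part is not shown). The function name and the page title are made up for the example.

```python
from html import escape


def build_link_page(base_url, handles, title="All items"):
    """Return a flat HTML page with one link per item handle."""
    prefix = base_url.rstrip("/") + "/handle/"
    rows = []
    for handle in handles:
        url = escape(prefix + handle, quote=True)
        rows.append(f'<li><a href="{url}">{escape(handle)}</a></li>')
    body = "\n".join(rows)
    return (
        f"<html><head><title>{escape(title)}</title></head>\n"
        f"<body>\n<ul>\n{body}\n</ul>\n</body></html>\n"
    )
```

Writing the returned string to a static file and linking to it from the footer gives crawlers a path to every item without touching the browse pages.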

So we can block all browse pages, but not item pages or bitstreams, and 
still get indexed.

DSpace 1.x has major scalability issues, alas. No matter how much hardware 
you throw at it.


Best,

--
Tom De Mulder td...@cam.ac.uk - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 14/01/2009 : The Moon is Waning Gibbous (83% of Full)



Re: [Dspace-tech] Adding and removing bitstreams

2008-09-23 Thread Tom De Mulder
On Tue, 23 Sep 2008, Hlias Stavrakis wrote:

 Hi, i face a problem on adding and removing bitstreams in both dspace
 1.4 and 1.5 and would like to ask the community and the
 developers of dspace for it.

It's essentially broken. Like most of the authorization system. Some of 
our users are really fed up with this, but it's such a mess to sort out 
properly.

--
Tom De Mulder [EMAIL PROTECTED] - Cambridge University Computing Service
- 23/09/2008 : The Moon is Waning Gibbous (53% of Full)



[Dspace-tech] DSpace 1.5 beta 1

2008-02-15 Thread Tom De Mulder

Could any of the more involved developers tell me why the database schema 
for DSpace 1.5 still has admin and submitter columns in the collection 
table, when there is a ResourcePolicy table? In our experience, if the 
former and latter disagree with each other, serious authz problems occur; 
it would be better if everything used the ResourcePolicy rather than the 
columns on the collection table.

Any reason why they can't be dropped for this release?


Best,

--
Tom De Mulder [EMAIL PROTECTED] - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 15/02/2008 : The Moon is Waxing Gibbous (58% of Full)



Re: [Dspace-tech] DSpace 1.5 beta 1

2008-02-15 Thread Tom De Mulder
On Fri, 15 Feb 2008, Scott Phillips wrote:

 system, but that's been discussed before. To answer you're question these 
 columns are still needed because that is where DSpace determines who is 
 allowed to submit or administrate a collection, and yes those epersons must 
 also be granted the basic resource policies over those objects as well - so 
 its best to avoid situations where they are out of sync. We are way too far 
 along in this release to consider a database schema change of this magnitude.

Right. I was under the impression that, given the add/admin right in the 
resourcepolicy table, we could just use those. For us, here, both those 
columns are empty, for example. We've got a patch ready to roll out to 
hide the UI elements that populate them, in the hope that that'll stop 
them getting out of sync.

I've only skimmed most of the talk about the architectural review, just 
being too busy to deal with the stream of emergencies at a local level.

We'll definitely be working on the authn/authz system in the very near 
future, which will probably take us down the route of having an ACL 
implementation that can cope with Shibboleth and our local single signon 
system... I was just hoping that 1.5 would get us started further along 
that route. :-)


Thanks for the response,

--
Tom De Mulder [EMAIL PROTECTED] - Cambridge University Computing Service
+44 1223 3 31843 - New Museums Site, Pembroke Street, Cambridge CB2 3QH
- 15/02/2008 : The Moon is Waxing Gibbous (59% of Full)
