[Dspace-tech] Assetstore clean-up? Advice/reassurance/gotchas?
Hi,

We are running DSpace v1.6.2 (JSPUI). Our DSpace server unexpectedly ran out of disk space (we thought we had allowed for substantial growth), and I have now determined that the cause is a huge amount of orphaned content in our Assetstore. Our bitstream table has 38,488 rows, of which 22,287 are marked as deleted=true.

This orphaned content is a result of the integration between our CRIS (Converis) and DSpace. Each time a Publication record (one that has full text attached and has already been exported to DSpace) is updated in our CRIS, the corresponding record in DSpace is updated too, but this is done by deleting the existing record and creating a new one with the same handle. The files belonging to the deleted record are then left lying around in the Assetstore as orphaned bitstreams.

I've been reading up in the manual, and I note the existence of the /dspace/bin/cleanup script, which I believe will resolve this problem by deleting the old rows from the bitstream table and the corresponding files in the assetstore (?).

However, before I run this and potentially muck everything up, I was hoping that someone could confirm that this script will do what I'm after, and that it can handle such a large cleanup. Are there any other gotchas, or any advice, that anyone out there can offer? I'll back up the assetstore and DB before doing anything, but any advice or reassurance would be most welcome, as I'm obviously nervous about running something that could, potentially, do more harm than good :-).

Cheers,
Mike

P.S. If there is anyone else out there with an integration between Converis and DSpace, you might also want to look into this!
Michael White eLearning Liaison and Development (eLD) Information Services S8, Library University of Stirling Stirling SCOTLAND FK9 4LA Email: michael.wh...@stir.ac.uk Tel: +44 (0) 1786 466877 Fax: +44 (0) 1786 466880 http://www.stir.ac.uk/is/staff/about/teams/aldt/#eld -- The University of Stirling is ranked in the top 50 in the world in The Times Higher Education 100 Under 50 table, which ranks the world's best 100 universities under 50 years old. The University of Stirling is a charity registered in Scotland, number SC 011159. -- Symantec Endpoint Protection 12 positioned as A LEADER in The Forrester Wave(TM): Endpoint Security, Q1 2013 and remains a good choice in the endpoint security space. For insight on selecting the right partner to tackle endpoint security challenges, access the full report. http://p.sf.net/sfu/symantec-dev2dev ___ DSpace-tech mailing list DSpace-tech@lists.sourceforge.net https://lists.sourceforge.net/lists/listinfo/dspace-tech List Etiquette: https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
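For anyone wanting to reproduce the count described in the message above (38,488 rows, of which 22,287, or about 58%, are flagged as deleted), here is a minimal sketch of the query. It is only an illustration: it runs against an in-memory SQLite stand-in, whereas a real DSpace database would be PostgreSQL or Oracle. The function names and the two-column demo table are assumptions made for the example; only the table name (bitstream) and the deleted flag come from the message itself.

```python
import sqlite3


def count_bitstreams_by_deleted(conn):
    """Return {deleted_flag: row_count} for the bitstream table."""
    rows = conn.execute(
        "SELECT deleted, COUNT(*) FROM bitstream GROUP BY deleted"
    ).fetchall()
    # SQLite stores booleans as 0/1, so normalise them to Python bools.
    return {bool(flag): count for flag, count in rows}


def make_demo_db():
    """Build a tiny in-memory stand-in for the bitstream table.

    Assumption: the demo schema has only two columns; the real table
    has many more, and lives in PostgreSQL or Oracle rather than SQLite.
    """
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE bitstream (bitstream_id INTEGER PRIMARY KEY, deleted INTEGER)"
    )
    demo_rows = [(1, 0), (2, 1), (3, 1), (4, 0), (5, 1)]
    conn.executemany("INSERT INTO bitstream VALUES (?, ?)", demo_rows)
    return conn


if __name__ == "__main__":
    print(count_bitstreams_by_deleted(make_demo_db()))
```

Against a real installation, the same SELECT could be run directly in the database's own client; the Python wrapper is only there to make the example self-contained.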
Re: [Dspace-tech] Assetstore clean-up? Advice/reassurance/gotchas?
We ran it on http://etd.sun.ac.za using DSpace 1.5.2 with no problems.

(In reply to the message from Michael White, michael.wh...@stir.ac.uk, of 12 March 2013 15:58.)
--
Hilton Gibson
Systems Administrator
JS Gericke Library
Room 1025D
Stellenbosch University
Private Bag X5036
Stellenbosch 7599
South Africa
Tel: +27 21 808 4100 | Cell: +27 84 646 4758
http://library.sun.ac.za
http://scholar.sun.ac.za
http://ar1.sun.ac.za
http://aj1.sun.ac.za
Re: [Dspace-tech] Assetstore clean-up? Advice/reassurance/gotchas?
Hi Mike,

yes, the cleanup script deletes bitstreams which are marked as deleted='t'. I wouldn't have any worries about running it, it works just fine. Tip: run it with the -v parameter to see which files it's currently deleting.

Regards,
~~helix84

Compulsory reading: DSpace Mailing List Etiquette
https://wiki.duraspace.org/display/DSPACE/Mailing+List+Etiquette
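The advice above amounts to running the cleanup script with -v after taking backups. A minimal sketch of wrapping that call in Python follows. The script path (/dspace/bin/cleanup) and the -v flag are taken from this thread; the function names and the default install directory argument are assumptions made for illustration, and nothing is executed unless run_cleanup() is called explicitly.

```python
import subprocess
from pathlib import PurePosixPath


def build_cleanup_command(dspace_dir="/dspace", verbose=True):
    """Build the argument list for the cleanup script named in the thread."""
    cmd = [str(PurePosixPath(dspace_dir) / "bin" / "cleanup")]
    if verbose:
        # -v makes the script report which files it is deleting.
        cmd.append("-v")
    return cmd


def run_cleanup(dspace_dir="/dspace", verbose=True):
    """Run the cleanup script; raises CalledProcessError on a non-zero exit.

    Back up the assetstore and the database before calling this.
    """
    return subprocess.run(build_cleanup_command(dspace_dir, verbose), check=True)


if __name__ == "__main__":
    # Only print the command here; call run_cleanup() deliberately.
    print(" ".join(build_cleanup_command()))
```

Keeping the command construction separate from the execution makes it easy to review exactly what would be run before anything is deleted.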
Re: [Dspace-tech] Assetstore clean-up? Advice/reassurance/gotchas?
Many thanks helix84 (such an enigmatic nomenclature ;-) ) and Hilton for your reassuring replies, and for the -v tip . . . I now feel confident enough to give this a bash :-)

Much appreciated.

Mike

(In reply to the message from helix84 of 12 March 2013 14:41.)