Good afternoon,

If there are files in all the right places, it might be less effort to export the entire repo and then import it into a fresh instance. Maybe. I know you'uns have a lot of stuff.
B--

>>> On 10/26/2012 at 11:58 AM, in message <39919fc910d0004bbec8514947c345759b734f1...@ndjsscc07.ndc.nasa.gov>, "Thornton, Susan M. (LARC-B702)[LITES]" <[email protected]> wrote:
> What I'm in the process of doing to validate the integrity of the restore is this:
>
> 1. List all files (-type f) in the /assetstore1 directory, then load the list into a temporary table with 3 columns in Postgres:
>    a. Record_id (unique identifier)
>    b. sub_directory (contains the first 6 digits of the file name)
>    c. file_name (contains the entire file name)
>
> 2. Execute a SQL query that looks at all rows in the bitstream table where store_number = 1 and checks whether the first 6 digits match the subdirectory the file was found in on assetstore1, and whether the file name matches the internal_id from the bitstream table.
>
> 3. Identify “orphan” files in /assetstore1 by reversing the “EXISTS” logic in step 2 above, to see if all the files found under /assetstore2 exist in the bitstream table.
>
> I just can’t figure out why there are so many duplicates and why there are tons of files 2 levels down from /assetstore1 that shouldn’t be there.
>
> Example:
>
> [inline screenshot attachment, not preserved]
>
> In this example, the highlighted files at the bottom of the screenshot shouldn’t be there. They’re all just 2 levels down and they should be in other directories. Here’s where I’m finding the duplicates. The highlighted files are in this incorrect subdirectory, but they’re ALSO in the correct subdirectory.
>
> I’m likely going to end up having to write a script to delete these “orphan” files. I thought maybe the DSpace cleanup script would take care of this, but it doesn’t. (SUGGESTION FOR FUTURE RELEASE!!! ☺)
>
> THANKS,
> Sue Walker-Thornton
> (w): (757) 864-2368
> (m): (757) 506-9903
>
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of helix84
> Sent: Thursday, October 25, 2012 5:37 PM
> To: Thornton, Susan M. (LARC-B702)[LITES]
> Cc: dspace-tech
> Subject: Re: [Dspace-tech] Assetstore structure question
>
> On Thu, Oct 25, 2012 at 11:02 PM, Thornton, Susan M. (LARC-B702)[LITES] <[email protected]> wrote:
> >> Hi Helix,
> >> We had a hardware failure on one of our assetstore file systems recently and I'm working on verifying the restore worked properly (it didn't, as I'm finding lots of duplicate files and files at depth level 3).
> >>
> >> Yes, I understand that in the bitstream table, "store_number" tells DSpace where to find the file and the "internal_id" tells DSpace 2 things:
> >> 1. which subdirectory in that store_number the file resides in (according to the first 6 digits)
> >> 2. what the actual file name is
> >>
> >> What a mess!
> >> Thanks for your reply - it was as I thought it was!
> >> Sue
>
> You may be lucky (as strange as it may sound) if you're getting whole files, which would indicate that just the filesystem was corrupted, not the contents of bitstreams. It should be possible to find the right location for files.
>
> If I were you, I'd make a list of all the files (find /dspace/assetstore -type f), then look up their names in the "bitstream" table, then take the corresponding checksum and run md5sum on the bitstream to verify its integrity.
>
> As for locations within assetstore, if the bitstream name is correct, it's easy to find the directory by the first three numbers of the bitstream filename.
>
> If the bitstream filename is corrupted, it would be harder. You could make a list of all their md5 sums and then do the lookup I mentioned in reverse - find the bitstream filename by its md5 sum in the "bitstream" table.
>
> Good luck! Just ask if you're in doubt.
>
> Regards,
> ~~helix84

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
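
The checks discussed in this thread (list every file, match it against the bitstream table, confirm it sits in the directory implied by its name, then compare md5 sums) could be sketched roughly as below. This is only an illustration under stated assumptions, not a DSpace tool: `audit_assetstore`, `expected_relpath` and `md5_of` are hypothetical names; the `bitstreams` mapping is assumed to be exported beforehand from the bitstream table (internal_id to md5 checksum, for a single store_number); and the layout is assumed to be three directory levels of two digits each, which matches the "first 6 digits" described above but should be confirmed against your own install.

```python
import hashlib
import os
from pathlib import Path


def expected_relpath(internal_id, levels=3, digits=2):
    """Relative path implied by an internal_id, e.g. 12/34/56/123456...

    ASSUMPTION: three directory levels of two digits each.
    """
    parts = [internal_id[i * digits:(i + 1) * digits] for i in range(levels)]
    return Path(*parts, internal_id)


def md5_of(path, chunk_size=1 << 20):
    """Hex md5 of a file, read in chunks so large bitstreams fit in memory."""
    digest = hashlib.md5()
    with open(path, "rb") as fh:
        for chunk in iter(lambda: fh.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def audit_assetstore(root, bitstreams):
    """Classify every file under root against the bitstream table.

    bitstreams: dict mapping internal_id -> md5 hex checksum
    (hypothetical export of the bitstream table for one store_number).
    """
    root = Path(root)
    report = {"ok": [], "misplaced": [], "orphan": [],
              "bad_checksum": [], "missing": []}
    for dirpath, _dirs, files in os.walk(root):
        for name in files:
            path = Path(dirpath) / name
            rel = path.relative_to(root)
            if name not in bitstreams:
                # No row in the bitstream table: an "orphan" file.
                report["orphan"].append(rel.as_posix())
            elif rel != expected_relpath(name):
                # Known name, wrong directory (e.g. a stray duplicate).
                report["misplaced"].append(rel.as_posix())
            elif md5_of(path) != bitstreams[name].lower():
                report["bad_checksum"].append(rel.as_posix())
            else:
                report["ok"].append(rel.as_posix())
    # Rows in the table with no file at the expected location.
    for internal_id in bitstreams:
        if not (root / expected_relpath(internal_id)).is_file():
            report["missing"].append(internal_id)
    return {key: sorted(values) for key, values in report.items()}
```

The sketch only reports; it deletes nothing. Reviewing the "misplaced" and "orphan" lists by hand before removing any file seems the safer course, since a misplaced copy may be the only intact one if the copy in the correct directory fails its checksum.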

