Good afternoon,

If there are files in all the right places, it might be less effort to
export the entire repo and then import into a fresh instance.  Maybe.  I
know you'uns have a lot of stuff.

B--

>>> On 10/26/2012 at 11:58 AM, in message
<39919fc910d0004bbec8514947c345759b734f1...@ndjsscc07.ndc.nasa.gov>,
"Thornton, Susan M. (LARC-B702)[LITES]" <[email protected]> wrote:
> What I'm in the process of doing to validate the integrity of the
> restore is this:
> 
> 1.    List all regular files (find -type f) in the /assetstore1
> directory, then load the list into a temporary table with 3 columns
> in Postgres:
> 
> a.  Record_id (unique identifier)
> 
> b.  sub_directory (contains the 1st 6 digits of the file name)
> 
> c.  file_name (contains the entire file name)
> 
> 2.    Execute a SQL query that looks at all rows in the bitstream
> table where store_number = 1 and checks that the first 6 digits match
> the subdirectory the file was found in on assetstore1, and that the
> file name matches the internal_id from the bitstream table.
> 
> 3.    Identify “orphan” files in /assetstore1 by reversing the
> “EXISTS” logic in step 2 above, to see if all the files found under
> /assetstore2 exist in the bitstream table.
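
A rough sketch of the three steps above in Python, in case it helps to
prototype the checks outside Postgres. It assumes the usual layout where
the first six digits of internal_id map to three two-digit directory
levels (12/34/56/123456...). The function names and the "/assetstore1"
default are only illustrative, and the list of internal_ids is assumed
to have been exported from the bitstream table beforehand.

```python
from pathlib import PurePosixPath


def expected_subdir(internal_id: str) -> str:
    """Assumed layout: first six digits -> three two-digit levels."""
    return "/".join((internal_id[0:2], internal_id[2:4], internal_id[4:6]))


def classify(paths, internal_ids, root="/assetstore1"):
    """Sort file paths into ok / misplaced / orphan lists.

    paths        -- output of something like `find <root> -type f`
    internal_ids -- internal_id values from the bitstream table
                    (hypothetical export, e.g. where store_number = 1)
    """
    known = set(internal_ids)
    result = {"ok": [], "misplaced": [], "orphan": []}
    for p in paths:
        path = PurePosixPath(p)
        name = path.name
        # Directory of the file, relative to the asset store root.
        subdir = path.parent.relative_to(root).as_posix()
        if name not in known:
            result["orphan"].append(p)      # no bitstream row at all
        elif subdir == expected_subdir(name):
            result["ok"].append(p)          # right name, right place
        else:
            result["misplaced"].append(p)   # known name, wrong directory
    return result
```

A file that shows up under both "ok" and "misplaced" (same name, two
locations) would be one of the duplicates described below.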
> 
> 
> 
> I just can’t figure out why there are so many duplicates and why
> there are tons of files 2 levels down from /assetstore1 that
> shouldn’t be there.
> 
> 
> 
> Example:
> 
> [inline screenshot omitted]
> 
> 
> 
> In this example, the highlighted files at the bottom of the screen
> shot shouldn’t be there.  They’re all just 2 levels down and they
> should be in other directories.  Here’s where I’m finding the
> duplicates.  The highlighted files are in this incorrect
> subdirectory, but they’re ALSO in the correct subdirectory.
> 
> 
> 
> I’m likely going to end up having to write a script to delete these
> “orphan” files.  I thought maybe the DSpace cleanup script would take
> care of this, but it doesn’t.  (SUGGESTION FOR FUTURE RELEASE!!! ☺)
> 
> 
> 
> THANKS,
> 
> Sue Walker-Thornton
> 
> (w):  (757) 864-2368
> 
> (m):  (757) 506-9903
> 
> 
> 
> 
> 
> -----Original Message-----
> From: [email protected] [mailto:[email protected]] On Behalf Of helix84
> Sent: Thursday, October 25, 2012 5:37 PM
> To: Thornton, Susan M. (LARC-B702)[LITES]
> Cc: dspace-tech
> Subject: Re: [Dspace-tech] Assetstore structure question
> 
> 
> 
> On Thu, Oct 25, 2012 at 11:02 PM, Thornton, Susan M.
> (LARC-B702)[LITES] <[email protected]> wrote:
> 
>> Hi Helix,
> 
>>     We had a hardware failure on one of our assetstore file systems
>> recently and I'm working on verifying the restore worked properly (it
>> didn't as I'm finding lots of duplicate files and files at depth
>> level 3).
> 
>>
> 
>>      Yes, I understand that in the bitstream table, "store_number"
>> tells DSpace where to find the file and the "internal_id" tells
>> DSpace 2 things:
> 
>>                 1.      which subdirectory in that store_number the
>> file resides (according to the first 6 digits)
> 
>>                 2.      what the actual file name is
> 
>>
> 
>> What a mess!
> 
>> Thanks for your reply - it was as I thought it was!
> 
>> Sue
> 
> 
> 
> You may be lucky (as strange as it may sound) if you're getting whole
> files, which would indicate that just the filesystem was corrupted,
> not the contents of bitstreams. It should be possible to find the
> right location for files.
> 
> 
> 
> If I were you, I'd make a list of all the files (find
> /dspace/assetstore -type f) then look up their names in the
> "bitstream" table, then take the corresponding checksum and run
> md5sum on the bitstream to verify its integrity.
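
For anyone who would rather script the check than run md5sum by hand,
here is a minimal sketch using Python's hashlib. The function names are
made up for illustration, and expected_checksum is assumed to be the MD5
hex string already looked up in the "bitstream" table for that file.

```python
import hashlib


def md5_of_file(path, chunk_size=1 << 20):
    """Return the MD5 hex digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as handle:
        # Read 1 MiB at a time so large bitstreams need not fit in memory.
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def verify(path, expected_checksum):
    """True if the file on disk matches the checksum from the database."""
    return md5_of_file(path) == expected_checksum.strip().lower()
```

Any file for which verify() comes back False would be a candidate for
restoring again from backup rather than just moving into place.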
> 
> 
> 
> As for locations within assetstore, if the bitstream name is correct,
> it's easy to find the directory by the first three numbers of
> bitstream filename.
> 
> 
> 
> If bitstream filename is corrupted, it would be harder. You could
> make a list of all their md5 sums and then do the lookup I mentioned
> in reverse - find the bitstream filename by its md5 sum in the
> "bitstream" table.
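
The reverse lookup could be sketched roughly like this. It assumes two
inputs prepared beforehand: (internal_id, checksum) pairs exported from
the "bitstream" table, and (path, md5) pairs computed for the files with
damaged names. Both function names are hypothetical.

```python
from collections import defaultdict


def index_by_checksum(rows):
    """Map each checksum to the internal_ids that carry it.

    rows -- iterable of (internal_id, checksum) pairs
    """
    index = defaultdict(list)
    for internal_id, checksum in rows:
        index[checksum.lower()].append(internal_id)
    return index


def recover_names(file_sums, rows):
    """Return {path: [candidate internal_ids]} for each hashed file.

    file_sums -- iterable of (path, md5_hex) pairs
    """
    index = index_by_checksum(rows)
    # .get() avoids creating empty entries for unknown checksums.
    return {path: index.get(md5.lower(), []) for path, md5 in file_sums}
```

Identical content stored as several bitstreams gives more than one
candidate for a file, so those cases would still need a manual look.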
> 
> 
> 
> Good luck! Just ask if you're in doubt.
> 
> 
> 
> Regards,
> 
> ~~helix84

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
