On Wed, Feb 06, 2008 at 05:23:54PM -0500, Tellier, Stephane wrote:
> Also, I need some DSpace experts here, but if I'm not wrong, the maximum 
> number of subdirectories in the default assetstore could reach up to 1000000 
> (100 X 100 X 100). It could be a problem if, for each bitstream, DSpace has 
> to check where it must place it (not sure though that this can be an issue).

That should not be a problem.  The internal name of a bitstream is a
hash over its content.  The path to the bitstream within the
assetstore is formed by taking successive pairs of digits from that
hash as directory names.  So, once the name of a bitstream is known,
the path can be calculated in negligible time.  No searching is
involved.
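
As a rough illustration of the path calculation described above, here
is a minimal sketch in Python.  It is not DSpace's actual code: the
function name bitstream_path, the choice of three directory levels
with two characters each (taken from the 100 X 100 X 100 figure in the
quoted message), and the use of SHA-1 for the example name are all
assumptions.

```python
import hashlib
from pathlib import PurePosixPath


def bitstream_path(internal_name: str,
                   levels: int = 3,
                   digits_per_level: int = 2) -> PurePosixPath:
    """Derive a relative assetstore path from a bitstream's internal name.

    Hypothetical helper: takes successive groups of characters from the
    front of the name as directory names, then appends the full name.
    """
    needed = levels * digits_per_level
    if len(internal_name) < needed:
        raise ValueError("internal name is too short for the directory layout")
    # Slice the leading characters into fixed-size directory names.
    parts = [internal_name[i:i + digits_per_level]
             for i in range(0, needed, digits_per_level)]
    return PurePosixPath(*parts, internal_name)


# Usage: the name here is a SHA-1 hex digest, chosen only for illustration.
example_name = hashlib.sha1(b"example content").hexdigest()
example_path = bitstream_path(example_name)
```

The cost is a few string slices, regardless of how many directories
already exist, which is why no searching is needed.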

The costliest case is a hash collision, which is highly unlikely:
DSpace would find that a bitstream with that name (and hence that
path) already exists.  I don't recall what DSpace does in this case,
but the Thing To Do would be to rehash with a different salt and try
again until an unused name is found.  That would be costly, but the
hashing function should be *really* good at making the need for this
rare.  I don't have the code in front of me, but it's something like
SHA-1 over the file's content concatenated with the current date/time.
I wouldn't spend much time worrying about this case.
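
The rehash-and-retry idea above could look roughly like the following
Python sketch.  This is an assumption about how one might do it, not
what DSpace does: the function unused_name, the exists callback, the
random 16-byte salt, and the attempt limit are all hypothetical.

```python
import hashlib
import os
from typing import Callable


def unused_name(content: bytes,
                exists: Callable[[str], bool],
                max_attempts: int = 16) -> str:
    """Return a SHA-1 based name that the store does not already hold.

    Hypothetical helper: `exists` reports whether a name is already
    taken.  Each attempt hashes the content with a fresh random salt.
    """
    for _ in range(max_attempts):
        salt = os.urandom(16)  # a different salt on every attempt
        name = hashlib.sha1(content + salt).hexdigest()
        if not exists(name):
            return name
    raise RuntimeError("no unused name found within the attempt limit")


# Usage: an in-memory set stands in for the assetstore lookup.
taken = set()
new_name = unused_name(b"example content", lambda n: n in taken)
taken.add(new_name)
```

With a good hash the loop should almost always finish on the first
attempt, so the retry path adds cost only in the collision case.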

-- 
Mark H. Wood, Lead System Programmer   [EMAIL PROTECTED]
Typically when a software vendor says that a product is "intuitive" he
means the exact opposite.

_______________________________________________
DSpace-tech mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/dspace-tech
