On Wed, Mar 6, 2013 at 3:19 PM, James, Eric <eric.ja...@yale.edu> wrote:

> in which akubra is described to have a non-database dependent hash driven
> file system vs the date/time algorithm of the default module.  But I'm not
> sure what that means in terms of real advantages in maintenance and
> performance.


The javadocs at


http://fedora-commons.org/documentation/3.2/javadocs/fedora/server/storage/lowlevel/akubra/HashPathIdMapper.html

describe the arrangement of the hash tree and the configuration details; it
seems to be a common idiom (I've implemented something similar in the
past).  The Fedora installation we're using at FLVC.org for our Islandora
development effort uses the '##' configuration out of the box - 256
top-level directories - for fedora/data/objectStore and
fedora/data/datastreamStore.
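
As a rough illustration of the idiom (not the actual HashPathIdMapper
code), the sketch below takes the leading hex digits of an MD5 hash of the
URI and substitutes them for the '#' characters in the pattern.  The
function name hash_path, the sample URI, and the percent-encoding of the
file name are assumptions made for the example; the real mapper's file name
encoding may differ in detail.

```python
import hashlib
from urllib.parse import quote


def hash_path(uri, pattern="##"):
    """Map a URI to a relative path: each '#' in the pattern is replaced
    by the next hex digit of the MD5 hash of the URI."""
    if set(pattern) - {"#", "/"} or not 1 <= pattern.count("#") <= 32:
        raise ValueError("pattern needs 1-32 '#' characters and only '#' or '/'")
    digits = iter(hashlib.md5(uri.encode("utf-8")).hexdigest())
    directory = "".join(next(digits) if c == "#" else "/" for c in pattern)
    # Assumption: a plain percent-encoding stands in for the real
    # filename-safe encoding of the URI.
    return directory + "/" + quote(uri, safe="")


# Hypothetical object URI, used only for illustration.
print(hash_path("info:fedora/demo:1", "##"))
print(hash_path("info:fedora/demo:1", "#"))
```

With '##' the first path component is two hex digits (256 possible
directories); with '#' it is one hex digit (16 possible directories).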

Our current plan, to support on the order of 10^6 objects, is to use '#'
for the top level, that is, 16 top-level directories, and to mount a
separate partition on each of the datastreamStore directories, so we can use
our volume management software to grow each partition to 2-3 TB.  The
objectStore will be left on one filesystem (it's much, much smaller).
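
A back-of-the-envelope check of those numbers, assuming the hash spreads
entries evenly and counting only one entry per object (the datastreamStore
will hold more, since an object can have several datastreams).  The helper
name fanout is made up for the example.

```python
def fanout(pattern):
    """Number of leaf directories produced by a pattern of '#' and '/'."""
    return 16 ** pattern.count("#")


objects = 10**6  # order of magnitude from the plan above

for pattern in ("#", "##", "##/##"):
    leaves = fanout(pattern)
    print(f"{pattern:6} {leaves:6} directories, "
          f"about {objects / leaves:,.0f} entries each")

# 16 partitions, each grown to 2-3 TB
print(f"total datastream capacity: {16 * 2}-{16 * 3} TB")
```

Under those assumptions a single '#' level puts roughly 62,500 entries in
each of the 16 directories, which is worth keeping in mind when choosing
the filesystem for each partition.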

This will allow us to use fast SAS drives for the objectStore and cheaper
SATA drives for the datastreams.

That's the current thinking, anyway.  I have not yet tested whether the
storage system is symlink-agnostic; if it is, that would allow more
flexibility along the lines of

 datastreamStore/0  ->  /data/partition-1
 datastreamStore/1  ->  /data/partition-1
 datastreamStore/2  ->  /data/partition-2
 datastreamStore/3  ->  /data/partition-2
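
A minimal sketch of how that layout could be tried out at the filesystem
level, using a temporary directory in place of /data.  The helper names
link_buckets and lands_on are hypothetical.  This only shows that ordinary
file writes follow the symlinks; it says nothing about whether Fedora's
storage layer does, which still needs testing against a real install.

```python
import os
import tempfile


def link_buckets(store, layout):
    """Create store/<bucket> -> partition symlinks for each layout entry."""
    os.makedirs(store, exist_ok=True)
    for bucket, partition in layout.items():
        os.makedirs(partition, exist_ok=True)
        os.symlink(partition, os.path.join(store, bucket))


def lands_on(store, bucket, name):
    """Write a file through the bucket symlink and return the real
    directory it ended up in."""
    path = os.path.join(store, bucket, name)
    with open(path, "wb") as fh:
        fh.write(b"test")
    return os.path.dirname(os.path.realpath(path))


root = tempfile.mkdtemp()  # stand-in for /data
store = os.path.join(root, "datastreamStore")
layout = {
    "0": os.path.join(root, "partition-1"),
    "1": os.path.join(root, "partition-1"),
    "2": os.path.join(root, "partition-2"),
    "3": os.path.join(root, "partition-2"),
}
link_buckets(store, layout)

for bucket in sorted(layout):
    print(bucket, "->", lands_on(store, bucket, "probe-" + bucket))
```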

The problem with a deep tree is that a complete traversal is very slow,
but a direct lookup, based on the hash of the datastream URI, is very fast.

I'd love to hear how other people are doing this, or opinions on the above
plan.

-Randy Fischer
_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users