I'd like to pull out and pick on three connected points in Mr. Birkland's 
elucidating remarks:

On Jun 16, 2011, at 11:26 AM, Aaron Birkland wrote:

> PidGenerator:
> This would need a new implementation to avoid shared state between Fedora 
> instances where possible.  The approach I took in the proof of concept is 
> simply a random UUID-based pid generator. Using Zookeeper or HBase itself to 
> coordinate between PidGenerator impls could work too -it looks like you are 
> investigating that approach.

Another approach would be to split identifier minting off as an independent 
service, "proxied" transparently through a given repository or pool of 
repositories. In other words, there would be a separate IdM service to which 
each repository connects whenever a getPid call is made against it. Such a 
service could be packaged with Fedora installs and preconfigured so that 
out-of-the-box behavior stays as it is now, while advanced installs (using a 
pool of repositories) could adjust their configuration to point at an external 
service. This would require significant coding work, but for institutions 
with a complex IdM profile (in which various kinds of URIs are in play, with 
institution-specific business logic connecting them), it might be very 
helpful. It might also help consortial groups that want to maintain 
separate but interoperable repositories.
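
To make the shape of that idea concrete, here is a rough sketch in Java. It is 
not Fedora's actual PidGenerator interface: PidSource, LocalUuidPidSource, 
ExternalPidSource and PidSources are names invented for illustration, and the 
"transport" function stands in for whatever call (HTTP or otherwise) an 
external identifier service would really need.

```java
import java.util.UUID;
import java.util.function.Function;

/** Hypothetical minting contract; NOT Fedora's real PidGenerator interface. */
interface PidSource {
    String nextPid(String namespace);
}

/** Default, in-process behavior: a random UUID under the given namespace. */
final class LocalUuidPidSource implements PidSource {
    @Override
    public String nextPid(String namespace) {
        return namespace + ":" + UUID.randomUUID();
    }
}

/** Delegates minting to an external service via a caller-supplied transport. */
final class ExternalPidSource implements PidSource {
    // Assumption: the transport takes a namespace and returns a minted pid.
    private final Function<String, String> transport;

    ExternalPidSource(Function<String, String> transport) {
        this.transport = transport;
    }

    @Override
    public String nextPid(String namespace) {
        String pid = transport.apply(namespace);
        if (pid == null || pid.isEmpty()) {
            throw new IllegalStateException("identifier service returned no pid");
        }
        return pid;
    }
}

/** Configuration decides which source backs the repository's getPid calls. */
final class PidSources {
    static PidSource fromConfig(Function<String, String> externalTransportOrNull) {
        return externalTransportOrNull == null
                ? new LocalUuidPidSource()
                : new ExternalPidSource(externalTransportOrNull);
    }
}
```

With no external transport configured, PidSources.fromConfig(null) gives the 
local behavior; a pooled install would pass a transport that reaches the 
shared service, and callers of nextPid would not need to know the difference.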

> FieldSearch:
> In a distributed environment, it probably does not make sense for each 
> instance to have its own independent field search index.  In the proof of 
> concept, I left this empty (i.e. a field search impl that never indexes or 
> returns results).   I don't think it would be easy to have HBase implement 
> field search, but have never tried.  I was thinking that it would be 
> necessary to erect field search as a standalone service that updates itself 
> asynchronously in response to messages from various running fedora instances.

Institutions that construct a distributed service architecture have complex 
and varied needs, and they are likely to want indexing and discovery 
solutions that go well beyond what Field Search can support. For some of 
them, externalizing Field Search might be a very viable path. The external 
service could be "proxied" through the repository services, in the way I 
suggest above for PidGenerator, so that the current WS contracts (and the 
admin clients) continue to be supported.
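
As a rough illustration of the asynchronous, standalone index described in the 
quoted text, here is a small Java sketch. Everything in it is hypothetical: 
IndexUpdate and ExternalFieldIndex are invented names, an in-memory queue 
stands in for a real messaging layer, and a single "title" field stands in for 
whatever fields an actual index would carry.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.Map;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.LinkedBlockingQueue;

/** Hypothetical update message that a repository instance would emit. */
final class IndexUpdate {
    final String pid;
    final String title; // null means the object was purged

    IndexUpdate(String pid, String title) {
        this.pid = pid;
        this.title = title;
    }
}

/** Stand-in for an index service shared by many repository instances. */
final class ExternalFieldIndex {
    private final BlockingQueue<IndexUpdate> inbox = new LinkedBlockingQueue<>();
    private final Map<String, String> titlesByPid = new ConcurrentHashMap<>();

    /** Called by any repository instance; queues the update and returns. */
    void publish(IndexUpdate update) {
        inbox.add(update);
    }

    /** Applies all queued updates; a real service would do this on its own thread. */
    int drain() {
        List<IndexUpdate> batch = new ArrayList<>();
        inbox.drainTo(batch);
        for (IndexUpdate u : batch) {
            if (u.title == null) {
                titlesByPid.remove(u.pid);
            } else {
                titlesByPid.put(u.pid, u.title);
            }
        }
        return batch.size();
    }

    /** Case-insensitive substring search over titles; returns sorted pids. */
    List<String> findByTitleContaining(String term) {
        String needle = term.toLowerCase();
        List<String> hits = new ArrayList<>();
        for (Map.Entry<String, String> e : titlesByPid.entrySet()) {
            if (e.getValue().toLowerCase().contains(needle)) {
                hits.add(e.getKey());
            }
        }
        Collections.sort(hits);
        return hits;
    }
}
```

Note that a search issued between publish and drain does not yet see the new 
object. That lag is the price of updating asynchronously, and a proxying 
repository would have to tolerate it.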

> ResourceIndex:
> I ignored this in the proof of concept, but in a distributed environment, it 
> would likely need to be deployed as an external service, updating itself 
> through messages or RPC.

Seconded, amen, +1, and other expressions of affirmation.

As Fedora becomes more and more interesting to systems integrators operating in 
environments in which large-scale distribution is simply normal, this kind of 
externalization will become more and more important to Fedora. Perhaps the RI 
and Field Search could eventually be treated as properly part of the domain of 
external services like GSearch. 

---
A. Soroka
Online Library Environment
the University of Virginia Library



_______________________________________________
Fedora-commons-developers mailing list
[email protected]
https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers
