Hi All,
I picked up OODT today and immediately thought about an implementation of
Apache Gora [0] for abstracting persistence within the CAS metadata
catalogue.
Right now, for me, the persistence of my metadata catalogue to Lucene or
MySQL is sufficient and I have no immediate justification for using some
alternative storage mechanism. However, I noticed that there are a few areas
where OODT could generally benefit from a Gora implementation.
It is natural that product discovery via a daemon-driven CAS crawler (for
example) will fire product streams of varying nature towards the catalogue
storage mechanism. Lucene or MySQL may not be the best option for storing such
streams of data and/or the best way to later retrieve that data. Gora would
enable a much more comprehensive variety of data stores to be available for
persistence of catalogue metadata and would also provide a much more
flexible model specifically geared towards better solutions for metadata
cataloguing. Currently we support Amazon DynamoDB, Accumulo, Cassandra,
HBase, HDFS, HSQLDB and MySQL. We have patches for Solr, MongoDB and
various file-based stores. There is also interest in implementing an Oracle
NoSQL DB... don't ask.
I notice that the SolrIndexer tool implemented by Paul provides an
expressive set of options for indexing to your Solr HTTP server. The
gora-solr module would provide all of these and more.
I suppose this entirely depends on the requirements for expanding metadata
catalogues within the File Manager.
Is it envisaged that such an implementation is, or would be, required for
some use cases?
As Gora builds on Hadoop principles, I suppose it would also enable folks
to use their metadata catalogues in different, possibly useful, ways that
adapt to their use cases.
Just an initial thought.
Thanks
Lewis


[0] http://gora.apache.org
-- 
*Lewis*
