Several recent issues (DS-2337, DS-2487, and perhaps DS-2488) suggest that we should step back and take a long look at how we are using the Solr 'statistics' core.
Solr seems designed for use as a cache. That's how the other cores are used: they can be refreshed from data in the database and the assetstore. But the statistics core is treated as durable storage, a sink (perhaps the only one) for event data. If you don't keep your 'dspace.log's forever, there may be NO WAY to recover statistical records in the event of disaster or a schema change. At the very least it can require some fancy footwork if statistics are to survive an upgrade. The Solr maintainers have basically said "don't do that":

  https://wiki.apache.org/solr/HowToReindex#Using_Solr_as_a_Data_Source

I think we need to give some more thought to how we can readily preserve usage records over DSpace upgrades and system failures.

I should admit here that I am skeptical of using Solr as the statistics store *at all*, however well it works most of the time. But it is not my purpose in this note to advocate for something different.

-- 
Mark H. Wood
Lead Technology Analyst

University Library
Indiana University - Purdue University Indianapolis
755 W. Michigan Street
Indianapolis, IN 46202
317-274-0749
www.ulib.iupui.edu
_______________________________________________
Dspace-devel mailing list
Dspace-devel@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/dspace-devel