As to roadmaps, I myself hope that discussion on this list, crystallized in the Duraspace Jira issue-tracking system, will make GSearch's direction completely transparent.
I can't help but point out that a very popular and well-supported XML language for describing mappings from XML metadata to the Solr (XML) document format already exists: XSLT.

---
A. Soroka
Online Library Environment
the University of Virginia Library

On Oct 12, 2011, at 8:44 PM, Tom Cramer wrote:

> Gert, Adam,
>
> In a similar but different vein, at RIRI this year, several of us had
> conversations about opportunities to share code and componentry between the
> Hydra and Islandora projects. One possibility that arose was generalizing
> solrizer[1] to act as a general purpose Fedora -> solr indexing tool.
> Currently solrizer is an integral component in the Hydra stack, and is Ruby
> on Rails code that uses the models defined in a given Hydra head to
> automatically index data stream contents into a solr index which is then used
> for both search and read operations (via Blacklight).
>
> To echo elements of Adam's proposal, the concept that emerged was to take the
> models that define the mappings of data streams to fields of the solr index
> out of solrizer's Ruby code, and instead express them as XML files. These XML
> files could then be stored in a Fedora repository. This would remove the
> platform dependency on Ruby development; it would also keep the mappings on
> how to interpret / index Fedora objects in the repo instead of application
> code.
>
> Matt Zumwalt (who may be kicking me under the table from Minneapolis right
> now) will be examining this as a possible architectural direction for
> solrizer moving forward.
>
> I don't know what opportunities, if any, there are for cross-pollination or
> convergence between solrizer and GSearch, but it seems that roadmap sharing
> at the very least would be healthy.
>
> - Tom
>
>
> [1] https://github.com/projecthydra/solrizer
>
>
> On Oct 12, 2011, at 7:41 AM, aj...@virginia.edu wrote:
>
>> Here's a less straightforward idea, which I haven't put into a Jira issue
>> because it warrants discussion, if it is even to become part of the roadmap.
>>
>> At OR in Austin, I presented an indexing system (based partly on ideas from
>> GSearch, but not on the GSearch codebase) that we at UVa are working on. One
>> of the key principles of this system is that because discovery and
>> presentation for repository contents are increasingly based on indexes, and
>> because discovery and presentation are parts of curation (viewed broadly),
>> it is worthwhile to move the configuration of indexing workflows inside the
>> repository being indexed, so that indexing configuration "lives" alongside
>> the indexed contents and can be managed through the same services. (In the
>> example of our system, RELS-INT RDF connects metadata datastreams in
>> indexable objects with indexer objects that contain indexing
>> transformations.)
>>
>> I'd like to propose that the roadmap for GSearch include the task of making
>> it possible for users to move configuration for indexing transformations
>> (_not_ necessarily configuration for the connections between indexes and
>> repositories, but only the configuration of indexing transformations)
>> _inside_ the repositories being indexed.
>>
>> One key affordance that would become available would be to manage indexing
>> transformations through the same APIs as are used for repository contents.
>> Because changing an index transformation would no longer require altering
>> material in the local GSearch install, but only the repository, all of the
>> wonderful functionality that Fedora already supplies in the core
>> repository services would become available (e.g. XACML policy controls,
>> metadata associations, a nice RESTful API, etc.).
>>
>> Doing this would require much careful thought as to how to model and
>> structure representations of indexing transformations in the repository
>> context, but it could have great benefits, as tools to manage indexing would
>> be able to rely on work already done and in progress for the management of
>> ordinary repository contents.
>>
>>
>> ---
>> A. Soroka
>> Online Library Environment
>> the University of Virginia Library
>>
>>
>> On Oct 12, 2011, at 10:07 AM, Gert Schmeltz Pedersen wrote:
>>
>>> This message is meant to open a discussion of the roadmap for GSearch.
>>> It started in a small group, but we invite participation from the wider
>>> group of fedora-developers. I copy this message to the fedora-users list so
>>> that GSearch users are informed about the discussion, but to follow it
>>> onwards and to contribute they have to subscribe to the fedora-developers
>>> list.
>>>
>>> I will initiate the discussion with a status. GSearch 2.2 has been the
>>> current release since December 2008. At OR2011 in Austin in June 2011 I
>>> presented a plan for development of GSearch, see
>>> https://conferences.tdl.org/or/OR2011/OR2011main/paper/view/416/127 .
>>> Following that, I have provided GSearch 2.3, and the official release is
>>> near. You can get the source at https://github.com/fcrepo/gsearch and
>>> fedoragsearch.war from the DTU prerelease site at
>>> http://www.cvt.dk/fedoragsearch/ and see the documentation page at
>>> http://miranth.cvt.dk/fedoragsearch/ .
>>>
>>> The next step in the plan is to provide GSearch 2.4 by the end of the year. I
>>> will use the issue tracker at
>>> https://jira.duraspace.org/secure/IssueNavigator.jspa?mode=hide&requestId=10311
>>> to track the work, and I invite your feedback and contributions. Potential
>>> committers may be enrolled; I have already had some responses to my invitation
>>> to potential committers at OR2011. Some of you may have heard at OR2011
>>> that I will retire by the end of the year.
>>> However, I will continue
>>> part-time to support GSearch users on the fedora-users list and continue to
>>> develop for GSearch and Fedora in partnership with people who have an
>>> interest in that.
>>>
>>> The post-2.4 roadmap discussion can take place both on this list and as new or
>>> modified issues in the issue tracker. I think that members of the initial
>>> small group will soon bring up issues.
>>>
>>> Gert
>>> ------------------------------------------------------------------------------
>>> All the data continuously generated in your IT infrastructure contains a
>>> definitive record of customers, application performance, security
>>> threats, fraudulent activity and more. Splunk takes this data and makes
>>> sense of it. Business sense. IT sense. Common sense.
>>> http://p.sf.net/sfu/splunk-d2d-oct
>>> _______________________________________________
>>> Fedora-commons-developers mailing list
>>> fedora-commons-develop...@lists.sourceforge.net
>>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers

_______________________________________________
Fedora-commons-users mailing list
Fedora-commons-users@lists.sourceforge.net
https://lists.sourceforge.net/lists/listinfo/fedora-commons-users