I remember the discussion. It seems to be a real improvement, and I will try to include it in version 2.4.
Gert

On 12/10/2011, at 16.24, <aj...@virginia.edu> wrote:

> I've offered one straightforward possibility (one that was discussed briefly
> in Austin) at:
>
> https://jira.duraspace.org/browse/FCREPO-1010
>
> Use Apache Tika for extraction:
> Apache Tika is a toolkit that can extract text and metadata from a wide
> variety of mimetyped formats (including PDF, via PDFBox). Employing Tika as
> an extraction engine in GSearch would immediately expand enormously the
> possible range of material over which GSearch could operate, and going
> forward, GSearch would benefit from new parsers and better-performing parsers
> created as part of that effort.
>
> ---
> A. Soroka
> Online Library Environment
> the University of Virginia Library
>
> On Oct 12, 2011, at 10:07 AM, Gert Schmeltz Pedersen wrote:
>
>> This message is meant to open for a discussion of the roadmap for GSearch.
>> It started in a small group, but we invite participation from the wider
>> group of fedora-developers. I copy this message to the fedora-users list so
>> that GSearch users are informed about the discussion, but to follow it
>> onwards and to contribute they have to subscribe to the fedora-developers
>> list.
>>
>> I will initiate the discussion with a status. GSearch 2.2 has been the
>> current release since December 2008. At OR2011 in Austin in June 2011 I
>> presented a plan for development of GSearch, see
>> https://conferences.tdl.org/or/OR2011/OR2011main/paper/view/416/127 .
>> Following that, I have provided GSearch 2.3, and the official release is
>> near. You can get the source at https://github.com/fcrepo/gsearch and
>> fedoragsearch.war from the DTU prerelease site at
>> http://www.cvt.dk/fedoragsearch/ and see the documentation page at
>> http://miranth.cvt.dk/fedoragsearch/ .
>>
>> Next step in the plan is to provide GSearch 2.4 by the end of the year. I
>> will use the issue tracker at
>> https://jira.duraspace.org/secure/IssueNavigator.jspa?mode=hide&requestId=10311
>> to track the work, and I invite your feedback and contributions. Potential
>> committers may be enrolled, I already had some responses to my invitation to
>> potential committers at OR2011. Some of you may have heard at OR2011, that I
>> will retire by the end of the year. However, I will continue part-time to
>> support GSearch users on the fedora-users list and continue to develop for
>> GSearch and Fedora in partnerships with people, who have an interest in that.
>>
>> The post-2.4 roadmap discussion can both be on this list and as new or
>> modified issues at the issue tracker. I think that members of the initial
>> small group will soon bring up issues.
>>
>> Gert
>> ------------------------------------------------------------------------------
>> All the data continuously generated in your IT infrastructure contains a
>> definitive record of customers, application performance, security
>> threats, fraudulent activity and more. Splunk takes this data and makes
>> sense of it. Business sense. IT sense. Common sense.
>> http://p.sf.net/sfu/splunk-d2d-oct
>> _______________________________________________
>> Fedora-commons-developers mailing list
>> Fedora-commons-developers@lists.sourceforge.net
>> https://lists.sourceforge.net/lists/listinfo/fedora-commons-developers