Hi,

I'm new to this community. I want to use Mahout as a component of an enterprise search project, which is still at the conceptual phase. My business need is to find everything related to a given task and reorganize the output as a new view. The results should be actionable. The system should also be integrated with our software development environment and tools: Subversion, JIRA and Redmine, SharePoint blogs, wikis, and people (Active Directory). "Everything" means files, tools, and people. Files are mostly text based (Word, PDF, source files); searching audio and video files is a later need.

Where do Mahout, Lucene/Solr, and the UIMA framework fit in the following scenario? And what are the system requirements for setting up a development environment?

X is a new project team member in a software development firm. Her project is mainly a 10-year-old maintenance project; however, customers also ask for small development requests on that platform. Her boss wants her to prepare a software requirements specification (SRS) document for a new request. Since she hasn't prepared an SRS before, she wants to find previously prepared documents and asks her colleagues for a sample. A friend gives her a sample, based on a very old version of the SRS, from her local computer. The company has a Windows file server and a new content management system (portal); some projects also use Subversion and wikis to store documents.

1. There should be a platform that can search files in all of these environments.
2. The system should understand that an SRS is an outcome of the software requirements engineering or analysis process. It should understand that "SRS", "software requirements specification", and "functional design description" are similar terms.
3. The company has manuals, templates, and process definitions about requirements engineering, and it has an SRS template that supersedes other versions. When searching, the system should list organizational docs first and then project docs related to SRSes.
4. The project has different SRSes written over 10 years, so the system should list that specific project's SRS templates, indicating version conflicts between the organizational document templates and the project's...
5. The system should also list the people who were previously involved in the requirements engineering process, first in that project and then in other projects.
6. The system should also have a suggestion mechanism. It should know the domain of the project X is working on and its subparts. For example, X is working on an e-commerce project, and the new request is about mobile payments.
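
To make points 2 and 3 above more concrete, here is a rough sketch in plain Python of the behaviour I have in mind. It is only an illustration, not a proposal for how Mahout, Lucene/Solr, or UIMA would implement it: the synonym group, the `scope` field, the sample documents, and the function names are all made up for this example, and the matching is a naive substring test on titles.

```python
# Hypothetical synonym group; a real system would load this from a curated
# thesaurus or learn it, rather than hard-coding it.
SYNONYMS = [
    {"srs", "software requirements specification", "functional design description"},
]

# Hypothetical listing order: organizational docs first, then project docs.
TIER = {"organizational": 0, "project": 1}


def expand_query(query):
    """Return the query together with every term in its synonym group."""
    q = query.strip().lower()
    for group in SYNONYMS:
        if q in group:
            return sorted(group)
    return [q]


def search(query, docs):
    """Naive title match on the expanded terms, organizational docs listed first."""
    terms = expand_query(query)
    hits = [d for d in docs if any(t in d["title"].lower() for t in terms)]
    return sorted(hits, key=lambda d: (TIER[d["scope"]], d["title"]))


# Made-up sample documents, for illustration only.
docs = [
    {"title": "Project A SRS v3", "scope": "project"},
    {"title": "Software Requirements Specification template", "scope": "organizational"},
    {"title": "Meeting notes", "scope": "project"},
    {"title": "Functional design description, billing module", "scope": "project"},
]

for d in search("SRS", docs):
    print(d["scope"], "|", d["title"])
```

Searching for "SRS" here also finds the documents titled with the longer, equivalent terms, and the organizational template is listed before the project documents.
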
In the same company, but in a different project, a team is working on e-wallet projects for a bank. Based on X's profile, the system should be able to suggest people, tools, and outcomes from that other project that relate to the payments domain.

Identifying the domain and grouping the related docs, tools, and people in an existing system is nearly impossible to do manually. I want the system to identify and cluster related things by itself, and also to learn and improve its results from user feedback. In addition, some people should be able to give input to the system by classifying concepts for it. For example, I have organizational assets: documents, tools, and people. The documents are project docs and organizational docs, and they are related. This can serve as guidance for the system.

I think Carrot2 does something very similar to what I describe, but it has a file limitation. Anyway, I need a roadmap to initiate a project like this. Where should I start?

Thanks,
