This project is mostly a text search project. You can get basic
functionality without doing any math of this sort. (The Lucene search
algorithms do a simplified and very fast version of one of the recommender
algorithms in Mahout.)
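
For example, a first cut at expert finding can be plain term scoring: score
each document against the topic and add up the scores per author. Here is a
minimal sketch, assuming hypothetical (author, text) pairs such as commit
messages. The sample data and the names tokenize and rank_experts are made up
for illustration, and the weighting is a simple TF-IDF variant, not Lucene's
actual scoring formula.

```python
import math
from collections import Counter, defaultdict

# Hypothetical sample data: (author, text) pairs, e.g. commit messages or docs.
DOCS = [
    ("alice", "mobile payments gateway integration for e-commerce checkout"),
    ("bob", "fix login page layout bug"),
    ("alice", "payments refund workflow specification"),
    ("carol", "e-wallet payments api for bank project"),
    ("bob", "update build scripts"),
]


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for a real analyzer)."""
    return text.lower().split()


def rank_experts(docs, query):
    """Rank authors by the summed TF-IDF score of their documents for the query."""
    tokenized = [(author, Counter(tokenize(text))) for author, text in docs]
    n = len(tokenized)

    # Document frequency: in how many documents each term occurs.
    df = Counter()
    for _, counts in tokenized:
        df.update(counts.keys())

    scores = defaultdict(float)
    for author, counts in tokenized:
        for term in tokenize(query):
            if term in counts:
                idf = math.log(1 + n / df[term])
                scores[author] += counts[term] * idf

    # Highest score first; ties broken by author name.
    return sorted(scores.items(), key=lambda kv: (-kv[1], kv[0]))


if __name__ == "__main__":
    for author, score in rank_experts(DOCS, "payments"):
        print(author, round(score, 3))
```

With the sample data, rank_experts(DOCS, "payments") puts alice ahead of
carol, and bob does not appear at all. Solr could play the role of the index
here, with the author kept as a field on each document.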

On Wed, Nov 16, 2011 at 6:38 AM, Burcu Buyukkagnici <[email protected]> wrote:

> Hi,
>
> Thanks for the resources. They, especially the blogs and their links, are
> very helpful for understanding things. I might have skipped the
> material related to expert finding in the docs, because I haven't read
> everything yet. Regarding expert finding, do I need a social engine to
> create, keep and relate profiles, or do Lucene/Solr or other Apache projects
> have this kind of functionality?
> I want people and the organization to be able to identify the experts on a
> topic, something like maven7.
> http://www.maven7.com/index_en.php?page=organizational
> The experts can be found from their products. For example, from Subversion
> annotations I can learn who previously worked on a similar subject. I want to
> see the related developers, test specialists and related bugs. Also, based
> on code dependencies, I want to identify the people who might be affected
> by the changes that I am making.
> I hope I have explained what I'm thinking. So, can profiling experts based
> mostly on text files and database records be done with Mahout, Lucene, etc.?
>
> Thanks again,
>
> On Tue, Nov 15, 2011 at 9:34 AM, Yuval Feinstein <[email protected]> wrote:
>
> > My 2c: Start with getting all the relevant texts into one place, namely a
> > search index.
> > A good prototyping tool would be Solr.
> > You will need something like ManifoldCF:
> > http://incubator.apache.org/connectors/
> > for collecting documents from the various environments.
> > Here is Erik Hatcher's "Rapid Prototyping With Solr":
> > http://www.slideshare.net/erikhatcher/rapid-prototyping-with-solr-4312681
> > Once you get enough stuff into Solr, you will be able to search it easily.
> > Next, you can start using Mahout:
> > http://www.lucidimagination.com/blog/2010/03/16/integrating-apache-mahout-with-apache-lucene-and-solr-part-i-of-3/
> > I would go for an iterative design, first taking a small sample of
> > documents from each environment,
> > trying the systems out, and then scaling.
> > Good luck,
> > Yuval
> >
> >
> > On Tue, Nov 15, 2011 at 9:12 AM, Burcu Buyukkagnici <[email protected]> wrote:
> >
> > > Hi,
> > > I'm new to this community. I want to use Mahout as a component of an
> > > enterprise search project. The project is at the conceptual phase. My
> > > business need is to be able to find everything related to a task and
> > > reorganize the output as a new view. The results should be actionable.
> > > Also, the system should be integrated with software development
> > > environment tools: Subversion, JIRA and Redmine, SharePoint blogs, wikis
> > > and people (Active Directory).
> > > "Everything" means files, tools and people. Files are mostly text based
> > > (Word, PDF, source files); searching audio and video files is a later
> > > need.
> > >
> > > Where do Mahout, Lucene/Solr and the UIMA framework fit in the following
> > > scenario? And what are the system requirements to set up a development
> > > environment?
> > >
> > > X is a new project team member in a software development firm. Her
> > > project is mainly a 10-year-old maintenance project; however, customers
> > > make small development requests on that platform. Her boss wants her to
> > > prepare a software requirements specification (SRS) document for a new
> > > request. Since she hasn't prepared an SRS before, she wants to find
> > > previously prepared documents, and asks her colleagues to give her a
> > > sample.
> > > Her friend gives her a sample, based on a very old version of the SRS,
> > > from her local computer. The company has a Windows file server and a new
> > > content management system (portal); some projects also use Subversion
> > > and wikis to store docs.
> > >
> > >
> > >   1. There should be a platform that can search files in all these
> > >   environments.
> > >   2. The system should understand that an SRS is an outcome of the
> > >   software requirements engineering or analysis process. The system
> > >   should understand that SRS, software requirements specification and
> > >   functional design description are similar terms.
> > >   3. The company has manuals, templates and process definitions about
> > >   requirements engineering, and has an SRS template which supersedes
> > >   other versions. When searching, the system should list organizational
> > >   docs first and then project docs related to SRSes.
> > >   4. The project has different SRSes written over 10 years. So the
> > >   system should list that specific project's SRS templates, indicating
> > >   version conflicts between org. document templates and projects...
> > >   5. The system should also list the people who were previously involved
> > >   in the requirements engineering process, first in that project and
> > >   then in other projects.
> > >   6. The system should also have a suggestion mechanism. The system
> > >   should know the domain of the project X is working on and its
> > >   subparts. For example, X is working on an e-commerce project, and the
> > >   new request is about mobile payments. In the same company, a different
> > >   project team is working on e-wallet projects for a bank. Based on her
> > >   profile, the system should be able to suggest people, tools and
> > >   outcomes from the other project relating to the payments domain.
> > >
> > > Identifying the domain and grouping the related docs, tools and people
> > > in an existing system is nearly impossible to do manually. I want the
> > > system to identify and cluster related things itself, and also to learn
> > > and improve the results from user feedback. Also, some people should
> > > give input to the system by classifying the concepts for it. For
> > > example: I have organizational assets: documents, tools, people. The
> > > documents are project docs and organizational docs, and they are
> > > related. This can be guidance for the system.
> > >
> > > I think Carrot2 does something very similar to what I describe, but it
> > > has a file limitation. Anyway, I need a roadmap to initiate a project
> > > like this. Where should I start?
> > >
> > > Thanks,
> > >
> >
>



-- 
Lance Norskog
[email protected]
