2009/3/6 Chris Mulligan <[email protected]>:
> So I've been playing with Whoosh a little this morning. Not really
> integrating it, just pulling out ticket and wiki information with my little
> http://trac-hacks.org/wiki/TracMergeScript Trac object, which makes it
> trivial to do things like iterate over all tickets. There's definite
> potential in this approach.
>
> Repository search would be awesome, but we should take care not to tie it
> too closely to SVN. We're running a multirepo Mercurial forest, which would
> require a very different implementation from a single SVN repository. I
> definitely used and really liked RepoSearch, though.
RepoSearch doesn't use anything SVN-specific; it uses the Trac VC
interface exclusively. That said, it was written before multirepo
support existed, so it no doubt needs some love.
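
For what it's worth, a backend-neutral walk only needs a handful of calls
from trac.versioncontrol. A rough sketch in Python, assuming `repos` is
whatever env.get_repository() hands back; `walk_files` is a made-up helper
name for illustration, not something RepoSearch defines:

```python
def walk_files(repos, path='/', rev=None):
    """Yield (path, data) for every file below `path`.

    Only uses Repository.get_node() plus the Node members isdir, path,
    get_entries() and get_content(), so nothing here is SVN-specific.
    """
    pending = [repos.get_node(path, rev)]
    while pending:
        node = pending.pop()
        if node.isdir:
            # Directories contribute their children, not content.
            pending.extend(node.get_entries())
            continue
        content = node.get_content()  # file-like object, or None
        if content is not None:
            yield node.path, content.read()
```

Each (path, data) pair could then be handed to an indexer, and the path
alone would be enough for a file name search.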
>
> whooshtest.py:
> import os, os.path
> import sys
> from whoosh.fields import *
> from whoosh import index
> from whoosh.qparser import MultifieldParser
> sys.path.append('tracmerge')
> from ptrac import Trac
>
> schema = Schema(id=ID(stored=True, unique=True), type=ID,
>                 keywords=KEYWORD(scorable=True),
>                 component=KEYWORD, milestone=TEXT, summary=TEXT(stored=True),
>                 content=TEXT, changes=TEXT)
>
> # If we don't have an index directory, create one and index
> try:
>     ix = index.open_dir('index')
> except Exception:
>     if not os.path.exists('index'):
>         os.mkdir('index')
>     ix = index.create_in('index', schema=schema)
>     writer = ix.writer()
>
>     t = Trac('../dev/')
>
>     # One document per ticket; comments are joined into 'changes'
>     for tid in t.listTickets():
>         print tid
>         ticket = t.getTicket(tid)
>         chgs = []
>         for chg in ticket['ticket_change']:
>             if chg['field'] == 'comment':
>                 chgs.append(chg['newvalue'])
>         writer.add_document(id=unicode(tid), type=u'ticket',
>                             summary=ticket['summary'], content=ticket['description'],
>                             keywords=ticket['keywords'],
>                             component=ticket['component'], milestone=ticket['milestone'],
>                             changes=u'\n\n'.join(chgs))
>
>     # One document per wiki page (current version only)
>     for pageName in t.listWikiPages():
>         print pageName
>         pageDetails = t.getWikiPageCurrent(pageName)
>         writer.add_document(id=pageName, type=u'wiki',
>                             summary=pageDetails['comment'], content=pageDetails['text'])
>     writer.commit()
>
> # search
> searcher = ix.searcher()
> parser = MultifieldParser(["content", 'keywords', 'component', 'milestone',
>                            'summary', 'changes'], schema=ix.schema)
>
> input = sys.argv[1]
> query = parser.parse(input)
> results = searcher.search(query)
> print results
> for res in results:
>     print res
>
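
The comment-gathering loop in the script could be pulled out into a helper
so it can be tried without a Trac environment. A small sketch (Python 3
syntax, unlike the script above), assuming the ticket dict has the shape
used there: a 'ticket_change' list of dicts with 'field' and 'newvalue'
keys. `comment_text` is a hypothetical name, not part of ptrac:

```python
def comment_text(ticket, separator='\n\n'):
    """Join the comment bodies of a ticket into one indexable string."""
    comments = [
        change['newvalue']
        for change in ticket.get('ticket_change', [])
        # Assumption: empty comments (e.g. pure status changes) are not
        # worth indexing; the original loop keeps them.
        if change['field'] == 'comment' and change['newvalue']
    ]
    return separator.join(comments)
```

The result would go straight into the changes= argument of add_document().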
> On Thu, Mar 5, 2009 at 9:41 AM, Jeff Hammel <[email protected]> wrote:
>>
>> On Thu, Mar 05, 2009 at 11:08:33AM +0100, Christian Boos wrote:
>> >
>> > W. Martin Borgert wrote:
>> > > On 2009-03-04 18:06, Chris Mulligan wrote:
>> > >
>> > >> This is motivated entirely by a local need. As our primary internal
>> > >> tracs grow (thousands of tickets and wiki pages) it's becoming harder
>> > >> and harder for users to find already existing content. They end up
>> > >> making lots of dupes, making the problem even worse the next time.
>> > >>
>> > >
>> > > Yes, the Trac search facilities are good, but sometimes not good
>> > > enough. Sometimes one likes to search "the whole thing", e.g.
>> > > including PDFs in the SVN trunk etc. I'm not sure if Whoosh
>> > > addresses this problem.
>> > >
>> >
>> > Searching content in the repository is addressed by the RepoSearch
>> > plugin on trac-hacks, if I'm right.
>> >
>> > http://trac-hacks.org/wiki/RepoSearchPlugin
>> >
>> > Looking for content inside a non-text file like a .pdf would require an
>> > additional extraction/analysis step.
>>
>> Perhaps an infrastructure could be built such that filters could be
>> applied to mimetypes and the search is performed on the results of that
>> filter. For example, you could apply pdftotext as a filter for PDFs and
>> search the resulting text, or apply antiword to (horrible) .doc files.
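
Something like a registry keyed on mimetype might be enough to start with.
A rough Python 3 sketch, assuming pdftotext and antiword are on the PATH;
FILTERS and extract_text are made-up names, and none of this is an existing
Trac interface:

```python
import mimetypes
import subprocess

# Mimetype -> command that writes plain text to stdout.
# "{path}" is replaced with the file to convert.
FILTERS = {
    'application/pdf': ['pdftotext', '{path}', '-'],
    'application/msword': ['antiword', '{path}'],
}

def extract_text(path, filters=FILTERS, run=subprocess.check_output):
    """Return searchable text for `path`, or None if no filter applies."""
    mimetype, _encoding = mimetypes.guess_type(path)
    if mimetype is None:
        return None
    if mimetype.startswith('text/'):
        # Plain text needs no filter; read it directly.
        with open(path, encoding='utf-8', errors='replace') as handle:
            return handle.read()
    command = filters.get(mimetype)
    if command is None:
        return None
    argv = [part.replace('{path}', path) for part in command]
    return run(argv).decode('utf-8', errors='replace')
```

The `run` parameter is only there so the external tools can be swapped out,
for instance when trying this on a machine without them installed.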
>>
>> > Also, I don't know if the plugin allows searching path names, which is
>> > useful for locating a source file when you have no idea which
>> > subproject or branch it is in ;-)
>> >
>> > -- Christian
--
"Life? Don't talk to me about life." - Marvin