I am working on a search mechanism that links a requirement (i.e., a query) to source code files (i.e., documents). For that purpose, I indexed the source code files using Lucene. In contrast to a traditional natural-language search scenario, we search for code files that are relevant to a given requirement. One problem is that source files usually contain many abbreviations, words joined by underscores, and combinations of words and/or abbreviations (e.g., getAccountBalanceTbl). Has anyone here already indexed source files, or other documents that contain such terms?

--
View this message in context: http://www.nabble.com/Indexing-source-code-files-tp15738615p15738615.html
Sent from the Lucene - Java Users mailing list archive at Nabble.com.
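
To make the identifier problem concrete, here is a minimal sketch of one possible pre-processing step: splitting an identifier on underscores and camel-case boundaries before the parts are handed to an analyzer. The class name IdentifierSplitter and its split method are hypothetical helpers, not part of Lucene, and the boundary rules are an assumption about what counts as a word in the code base.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Locale;

// Hypothetical helper for illustration; not a Lucene class.
public class IdentifierSplitter {

    // Splits an identifier into lowercase parts at:
    //   - runs of underscores            (account_balance -> account, balance)
    //   - lower/digit to upper boundary  (getAccount      -> get, Account)
    //   - acronym to word boundary       (XMLParser       -> XML, Parser)
    public static List<String> split(String identifier) {
        List<String> parts = new ArrayList<>();
        String[] pieces = identifier.split(
                "_+|(?<=[a-z0-9])(?=[A-Z])|(?<=[A-Z])(?=[A-Z][a-z])");
        for (String piece : pieces) {
            // A leading underscore yields an empty first piece; skip it.
            if (!piece.isEmpty()) {
                parts.add(piece.toLowerCase(Locale.ROOT));
            }
        }
        return parts;
    }

    public static void main(String[] args) {
        System.out.println(split("getAccountBalanceTbl")); // [get, account, balance, tbl]
        System.out.println(split("account_balance_tbl"));  // [account, balance, tbl]
    }
}
```

Splitting alone does not resolve abbreviations: a part such as tbl would still need some mapping to "table" (for example a hand-maintained expansion list, which is an assumption here) before it can match the wording of a requirement. Indexing the original identifier alongside its parts keeps exact-name searches working.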