|
Hello, I am building a search engine that indexes and searches text files containing C# source code, as a project for an Information Retrieval course at an Italian university.
I am still planning the project and am looking for a smart way to index the files. I use an open source C# parser to build a DOM of each file, containing its namespace, class and method definitions. I want to be able to search the full text as well as by namespace name, class name and method name. What kind of indexing should I perform?
I'm pretty new to Lucene, so I may still be missing something. As I understand it, to search for a method called "Print" I would index each file on a field called, for example, "method", whose value is the name of the method. But a single C# source file may contain multiple methods, so I am not sure which indexing strategy to adopt. Is it possible to create two fields with the same name in the same document? Thanks! |
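
To make the question concrete, here is a minimal sketch of the structure I have in mind, written in plain Python rather than against Lucene. Everything in it is hypothetical and for illustration only: the `ToyIndex` class, the field names (`namespace`, `class`, `method`, `contents`) and the sample file names. It only shows one document carrying the `method` field twice; it does not use or imitate Lucene's API, and the whitespace tokenizer is far cruder than a real analyzer.

```python
from collections import defaultdict


class ToyIndex:
    """Toy inverted index in which a document may repeat a field name."""

    def __init__(self):
        # Maps (field name, lowercased term) to the set of document ids.
        self.postings = defaultdict(set)

    def add_document(self, doc_id, fields):
        # `fields` is a list of (name, value) pairs rather than a dict,
        # so the same name (e.g. "method") can appear more than once.
        for name, value in fields:
            for term in value.lower().split():
                self.postings[(name, term)].add(doc_id)

    def search(self, field, term):
        # Return the ids of documents whose `field` contains `term`.
        return sorted(self.postings.get((field, term.lower()), set()))


index = ToyIndex()

# One source file with two methods: the "method" field is added twice.
index.add_document("Printer.cs", [
    ("namespace", "Demo.Output"),
    ("class", "Printer"),
    ("method", "Print"),
    ("method", "Flush"),
    ("contents", "namespace Demo.Output { class Printer { void Print ( ) { } void Flush ( ) { } } }"),
])

# A second file with a single method.
index.add_document("Parser.cs", [
    ("namespace", "Demo.Input"),
    ("class", "Parser"),
    ("method", "Parse"),
    ("contents", "namespace Demo.Input { class Parser { void Parse ( ) { } } }"),
])

print(index.search("method", "Print"))
print(index.search("method", "Flush"))
print(index.search("contents", "parser"))
```

In this sketch, searching the `method` field for either "Print" or "Flush" finds `Printer.cs`, because both values were indexed under the same field name for the same document, while the `contents` field covers the full-text case.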
