Hi Robert,

On 14 Jul 05, at 23:39, Robert Graham wrote:

...This is mostly a question for the mentors on the project, but ideas
are welcome. I posted to dev mostly to have it backed up...

Posting to dev@ only is fine; we're monitoring it (but feel free to ping me directly if questions stay unanswered for too long).

...I've been through the prototype code and I've even gotten a couple of
the TODO's taken care of (specifically the matching multiple key-value
pairs and most of a refactoring of slop/xml-to-snippets.xsl)...

Cool. I'll request commit access for you to the whiteboard/refdoc subdirectory in the next few days, so that you can commit this.

...I'm not
sure how to accomplish the selecting of multiple codebases, but I'm
interested in ideas..

This can come later, for example by using an XML file which maps symbolic codebase names to actual paths, and generating a virtual "top-level directory" from this file.
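
To make the idea concrete, here is a minimal sketch of such a mapping, in Python only for brevity. Nothing here exists in the prototype: the <codebases>/<codebase> element names, the name/path attributes, the example paths and both function names are assumptions made up for illustration.

```python
import xml.etree.ElementTree as ET

# Hypothetical mapping file: element names, attribute names and paths
# are assumptions, not something the prototype defines.
CODEBASES_XML = """\
<codebases>
  <codebase name="core" path="/src/cocoon/core"/>
  <codebase name="samples" path="/src/cocoon/samples"/>
</codebases>
"""


def load_codebases(xml_text):
    """Return a dict mapping symbolic codebase names to actual paths."""
    root = ET.fromstring(xml_text)
    return {cb.get("name"): cb.get("path") for cb in root.findall("codebase")}


def virtual_top_level(codebases):
    """List the entries of a virtual top-level directory, one per codebase."""
    return [f"{name}/ -> {path}" for name, path in sorted(codebases.items())]


codebases = load_codebases(CODEBASES_XML)
for entry in virtual_top_level(codebases):
    print(entry)
```

The point of the indirection is that the rest of the chain only ever sees the symbolic names, so adding a codebase means adding one line to the mapping file.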

...I thought we should make a minor change to the
doktor comment syntax in adding commas between key-value pairs to make
it more reasonable to process...

Good idea, go for it!
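
A rough sketch of why the commas help, in Python only for brevity. The comment syntax shown ("@doktor" followed by comma-separated key=value pairs) is an assumption for illustration, not the agreed format, and parse_doktor_pairs is a hypothetical helper.

```python
def parse_doktor_pairs(comment):
    """Parse comma-separated key=value pairs following an @doktor marker.

    The '@doktor key=value, key=value' syntax is an assumed example,
    not the actual doktor comment format.
    """
    marker = "@doktor"
    start = comment.find(marker)
    if start < 0:
        return {}
    body = comment[start + len(marker):]
    pairs = {}
    for chunk in body.split(","):
        if "=" not in chunk:
            continue  # ignore anything that is not a key=value pair
        key, value = chunk.split("=", 1)
        pairs[key.strip()] = value.strip()
    return pairs


print(parse_doktor_pairs("@doktor key=FileGenerator, type=java, title=Reading files"))
```

With a comma as the separator, a value such as "Reading files" can contain spaces without confusing the parser, which would be harder with whitespace-separated pairs.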

...Past all that I'm looking for a little
more direction. I understand the goal and the prototype as it is, but
I'm a little confused as to the next step...

Sorry to have left you with very little info until now; the timing was bad, as I was offline last week. I'll have more time to follow up now (until about August 10th, when I'll be offline again for 10 days).

The steps that I'm seeing are the following (some of them are done already, at least partially):

1. Extract snippets from the various types of source files: XML, Java, plain text.

2. Convert these snippets to an XML form that is easily indexable with Lucene, generating Lucene "fields" for all important pieces of information: snippet key, snippet type, title, etc.

2b. Also generate "navigation documents" which Lucene will use to find all snippets. This is shown in the prototype already.

3. Crawl and index the generated XML documents with Lucene, at first using the Lucene block out of the box, I assume. Some manual work (like starting the index creation from a URL) is OK at this stage; we're trying to demonstrate the full chain before implementing everything.

4. Create the required Lucene queries to put together snippets coming from different source files but having the same key (e.g. all "FileGenerator" snippets). I might need to add @doktor stuff to existing code and samples so that you can see better how this should work.

5. Transform the results of these queries into an XML document in a publication-neutral format, where one document contains all the info and code excerpts provided by snippets having the same key.
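
Steps 4 and 5 can be sketched as follows, in Python only for brevity. This is not Lucene code: a plain in-memory grouping stands in for the Lucene queries, and the snippet records, the <document>/<snippet> element names and both function names are assumptions for illustration only.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

# Hypothetical snippets, as they might look after steps 1 and 2.
SNIPPETS = [
    {"key": "FileGenerator", "type": "java", "source": "FileGenerator.java",
     "text": "Reads an XML file."},
    {"key": "FileGenerator", "type": "xml", "source": "sitemap.xmap",
     "text": "Sample sitemap entry."},
    {"key": "OtherComponent", "type": "text", "source": "notes.txt",
     "text": "Unrelated note."},
]


def group_by_key(snippets):
    """Stand-in for step 4: collect snippets that share the same key."""
    groups = defaultdict(list)
    for snippet in snippets:
        groups[snippet["key"]].append(snippet)
    return dict(groups)


def to_document(key, snippets):
    """Stand-in for step 5: one XML document holding all snippets for a key."""
    doc = ET.Element("document", {"key": key})
    for snippet in snippets:
        el = ET.SubElement(
            doc, "snippet", {"type": snippet["type"], "source": snippet["source"]}
        )
        el.text = snippet["text"]
    return doc


groups = group_by_key(SNIPPETS)
doc = to_document("FileGenerator", groups["FileGenerator"])
print(ET.tostring(doc, encoding="unicode"))
```

In the real chain the grouping would come from a Lucene query on the snippet key field rather than from a list in memory, but the shape of the output (one document per key, one element per contributing snippet) is the part that matters for the publication step.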

Let me know if you need more info about this!

-Bertrand
