I thought about the ParseTree class a bit today and how it should 
work. OK, it was a beautiful day and I was outside too. ;-)

I'm not too tied to any of the details (especially names), but I 
figured that it might help to voice some of my thoughts to the list 
before hammering away at anything.

For each operator, there should be a derived class. These will each 
have their own way of parsing a query and combining subtrees (i.e. 
results). So there will be a minimum of an AndParseTree, OrParseTree, 
NotParseTree, and ExactParseTree. Others could be added pretty easily 
from there (e.g. NearParseTree).
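
To make the one-class-per-operator idea concrete, here is a rough, 
self-contained sketch of how Combine might differ between subclasses. 
It is only an illustration under some assumptions: DocSet (a plain set 
of document IDs) stands in for ResultList, WordParseTree is a 
hypothetical leaf class, CombineChildren is a made-up helper, and NOT 
is treated as "left but not right". None of this is existing ht://Dig 
code.

```cpp
#include <algorithm>
#include <cassert>
#include <iterator>
#include <memory>
#include <set>
#include <utility>

// Assumption: a set of document IDs stands in for ResultList.
using DocSet = std::set<int>;

class ParseTree {
public:
    ParseTree(std::unique_ptr<ParseTree> l, std::unique_ptr<ParseTree> r)
        : left(std::move(l)), right(std::move(r)) {}
    virtual ~ParseTree() = default;

    // Merge the subtrees' results according to the operator type.
    virtual void Combine() = 0;
    const DocSet &Results() const { return results; }

protected:
    // Hypothetical helper: evaluate both branches before merging them.
    void CombineChildren() {
        assert(left && right);
        left->Combine();
        right->Combine();
    }

    std::unique_ptr<ParseTree> left;
    std::unique_ptr<ParseTree> right;
    DocSet results;
};

// Hypothetical leaf: holds the documents already found for one word.
class WordParseTree : public ParseTree {
public:
    explicit WordParseTree(DocSet docs) : ParseTree(nullptr, nullptr) {
        results = std::move(docs);
    }
    void Combine() override {}  // nothing to merge at a leaf
};

class AndParseTree : public ParseTree {
public:
    using ParseTree::ParseTree;
    void Combine() override {
        CombineChildren();
        const DocSet &a = left->Results();
        const DocSet &b = right->Results();
        results.clear();
        // Keep only documents present in both branches.
        std::set_intersection(a.begin(), a.end(), b.begin(), b.end(),
                              std::inserter(results, results.end()));
    }
};

class OrParseTree : public ParseTree {
public:
    using ParseTree::ParseTree;
    void Combine() override {
        CombineChildren();
        const DocSet &a = left->Results();
        const DocSet &b = right->Results();
        results.clear();
        // Keep documents present in either branch.
        std::set_union(a.begin(), a.end(), b.begin(), b.end(),
                       std::inserter(results, results.end()));
    }
};

class NotParseTree : public ParseTree {
public:
    using ParseTree::ParseTree;
    void Combine() override {
        CombineChildren();
        const DocSet &a = left->Results();
        const DocSet &b = right->Results();
        results.clear();
        // Assumption: NOT means "in the left branch but not the right".
        std::set_difference(a.begin(), a.end(), b.begin(), b.end(),
                            std::inserter(results, results.end()));
    }
};
```

Calling Combine on the root walks the whole tree, so htsearch would 
only need to read the root's results afterwards.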

The "boolean" method will really correspond to the base class, since a 
boolean query is composed of operators. Its Parse method would pick 
subclasses as appropriate. For any other method, htsearch will create 
the appropriate ParseTree subclass and hand off the user query.

The class would be responsible for assembling itself to fit the query 
as well as merging results after searching. (I'm not sure whether it 
should do the searching itself or if it should be more of a 
container.) Of course, before the search is performed, the Fuzzy 
method should be called with the list of fuzzy objects. Doing it this 
way allows some algorithms to be applied selectively if necessary. A 
Fuzzy object essentially already returns a StringList, so the 
Parse(StringList *) method can be used on its output.
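
As a rough illustration of the Fuzzy step, the sketch below expands 
one query word through a list of fuzzy algorithms and collects the 
variants into a single word list, which is the sort of thing that 
could then be handed to Parse(StringList *). Everything here is a 
stand-in: WordList replaces StringList, FuzzyFn replaces the Fuzzy 
objects, and ExactMatch and ToyPlural are made-up algorithms, not the 
real ones.

```cpp
#include <algorithm>
#include <functional>
#include <string>
#include <vector>

// Assumptions: WordList stands in for StringList, and a plain function
// object stands in for a Fuzzy object that maps a word to its variants.
using WordList = std::vector<std::string>;
using FuzzyFn = std::function<WordList(const std::string &)>;

// Run every fuzzy algorithm on the word and collect the variants,
// dropping duplicates while preserving first-seen order.
WordList ExpandWord(const std::string &word,
                    const std::vector<FuzzyFn> &fuzzies) {
    WordList out;
    for (const auto &fuzzy : fuzzies) {
        for (const auto &variant : fuzzy(word)) {
            if (std::find(out.begin(), out.end(), variant) == out.end())
                out.push_back(variant);
        }
    }
    return out;
}

// Hypothetical algorithms, for illustration only.
WordList ExactMatch(const std::string &w) { return {w}; }
WordList ToyPlural(const std::string &w) { return {w, w + "s"}; }
```

A subclass that wants to skip fuzzy matching (an exact-phrase node, 
say) could simply pass an empty list, or only the exact algorithm.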

class ParseTree
{
public:
     // Like the List and Stack and other container classes
     // Release disconnects the branches
     void               Release();
     // Destroy disconnects branches AND frees them
     void               Destroy();

     // Parse either a base string itself or a list of strings
     // Returns either OK or NOTOK as to the correctness of the query
     virtual int        Parse(String);
     virtual int        Parse(StringList *);

     // Combine right and left lists (if present) according to our specific
     // operator type (e.g. AND, OR, NOT, NEAR, etc.)
     virtual void       Combine();

     // If passed a list of Fuzzy methods, use them to fill out the tree
     // (note that some subclasses may choose to ignore this if desired)
     virtual void       Fuzzy(List);

private:
     // One or the other of these could be empty
     ParseTree          *right;
     ParseTree          *left;

     WeightWord         data;
     ResultList         *results;

     // [Various helper routines for cleaning up user input,
     // e.g. trimming punctuation, etc.]
};

The WeightWord and ResultList classes will need some modifications as 
well, not the least of which is adding a mask to WeightWord to allow 
searches to be restricted based on field.
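
For the mask idea, something along these lines is what I have in 
mind. This is just a sketch: MaskedWord is a stand-in for the modified 
WeightWord, and the FIELD_* names and bit values are invented for the 
example rather than taken from the existing code.

```cpp
#include <cstdint>
#include <string>

// Hypothetical field flags; the names and bit values are made up.
enum FieldFlag : std::uint32_t {
    FIELD_TEXT     = 1u << 0,
    FIELD_TITLE    = 1u << 1,
    FIELD_HEADING  = 1u << 2,
    FIELD_KEYWORDS = 1u << 3,
    FIELD_ALL      = 0xFFFFFFFFu
};

// Stand-in for WeightWord with the proposed mask added.
struct MaskedWord {
    std::string   word;
    double        weight = 1.0;
    std::uint32_t mask = FIELD_ALL;  // by default, match in any field

    // True if an occurrence with the given field flags is allowed
    // by this word's mask.
    bool Matches(std::uint32_t occurrenceFlags) const {
        return (occurrenceFlags & mask) != 0;
    }
};
```

A word restricted to titles and headings would carry 
FIELD_TITLE | FIELD_HEADING as its mask, and occurrences found only in 
body text would be skipped when the results are collected.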

One final concern: this model makes it very difficult to add the 
AltaVista url: syntax for adding URL restrictions. I'm not sure how a 
ParseTree should pass up information like this about how the search 
should be performed. For now I'm content to shelve the problem until 
we can get a clean, working replacement for the current code. ;-)

What do people think?

-Geoff
