Re: LyX New Spell Checker Interface

Asger Alstrup Nielsen Sun, 14 Mar 1999 08:31:58 -0500
> Here is a rough outline of what I had in mind for LyX new Spell
> Checker.  

Thank you for taking the time to do this!

> Let me know what you think.  I will right the interface to
> aspell and rely on someone else to write the interface for Ispell.

Ok, that sounds fair to me.

I have some comments to the code, and then some comments to the design as such.

> class SpellChecker;
> 
> class DictManager {
> public:
>   void add_sc(SpellChecker *); 
>   void remove_sc(SpellChecker *);
>   SC_Error save_wls(); //save all word lists;
>   // a bunch of other dictionary management methods
> };

I suggest to rename these to "addSpellChecker" and "removeSpellChecker" since
that's the LyX way of naming.  Also, non-standard abbreviations are bad.

What is this DictionaryManager used for?

> class SpellChecker {
>   typedef Itr    ... // A forward (but preferably bidirectional) iterator
>   typedef EndItr ... // n iterator such that if i an Itr and e is an EndItr
>                      // i == e if and only if i is at the end of the iterator
>                      // range.  This can be the same as Itr and in most cases
>                      // it probably is.

I don't think you should distinguish between the two kinds of iterators.  An
iterator can point to any element in the list, and at the end of the list.

The spellchecker should work on strings.  The LyX data structure will be small
strings, so you should simply use LString::const_iterator to make life simplest
for us.

> public:
>   SpellChecker(DictManager *);
> 
>   void set_language(const string &lang);
>   string language() const;

I think we need methods to define the encoding of the data.
LyX can provide MIME-type encoding flags for the spell checker.  
This might be needed in the future, so add:

        void set_encoding(string const & encoding);
        string encoding() const;

>   void restart(Itr c, EndItr e); 
>   // starts or restarts the process with a new iterator pair.  Will stop 
>   // when it encounters a misspelling or reaches the end.
>   // If spell checking has already started c must be within one word of
>   // where the spell checker stops.  If you need to skip over an area
>   // use the scan method.

I suggest that you comment your code like this:

/** Starts or restarts the process with a new iterator pair.
    Will stop when it encounters a misspelling or reaches the end.
    If spell checking has already started, "start" must be within one
    word of where the spell checker stopped.  If you need to skip over
    an area, use the scan method.
 */
void restart(LString::const_iterator start, LString::const_iterator end);

Then, our automatic Documentation tool Doc++ will create some nice HTML pages
of the interface.

>   void skip_word();
>   // skip past the current mispelled word

Here, you should use

/// Skip past the current misspelled word
void skipWord();

>   void continue();
>   // continues the process.  Will stop when it enconters a misspelling
> 
>   string  word() const;
>   // returns the misspelled word.
>   Itr     word_begin() const;
>   // returns an iterator pointing to the beggining of the misspelled word
>   Itr     word_end()   const;
>   // returns an iterator pointing to the end of the misspelled word

Maybe you could use a pair instead:

/// Returns a range that surrounds the misspelled word
pair<LString::const_iterator start, LString::const_iterator end>
word_boundary() const;

>   bool at_end() const;
>   // returns true if the spell checker reached the end of the iterator range.

Since later you present the "cur" and "end" methods, there is no need for this
one.  The user can just do "if (spellchecker.current() == spellchecker.end()) {
.. }"

>   void scan_ahead(Itr stop);
>   // skips to position stop gatering any nessary state information.

Maybe this should be called "moveTo"?

>   void reset();
>   // Resets all state information.

What is this used for?

>   void scan(Itr begin, Itr stop, EndItr end);
>   // Scans from begin until end gatering any nessary state
>   // information.  This has the potential of being much more efficent
>   // if Itr is bidirectional
> 
>   // Note: In order to properly support some of Aspell advanced spell
>   // checking modes it is important that you use the above three methods to 
>   // move around the document.

What is the purpose of this method "scan"?
 
>   Itr    cur() const;
>   // returns the location where the spell checker will resume checking
>   EndItr end() const;
>   // returns the end iterator
> 
>   WordList suggestions();
>   // returns a list of suggestions for the current word

This word list should be a vector<LString const &>.
 
>   void add_personal(const string &word);
>   // add a word to the personal word list
>   void add_session(const string &word);
>   // add a word to the session or "ignore all" word list
> 
>   void save_all_wls();
>   // save all relevent word lists
> 
>   void clear_session();
>   // clear the session word list
> 
>   bool ignore_replacements();
>   bool ignore_replacements(bool);
>   void store_replacement(const string &cor, bool memory = true);
>   // if ignore_replacements is not set return store the replacement pair
>   // the memory parameter should be ignored
> };

What is this ignore_replacement state?

The design presents an interface on words.  I think the design is fairly
complete.  However, the stuff about scanning, skipping and all that is a bit
complicated.  Why is this needed?

In general, I prefer to have a minimal interface.  The one you present has many
methods that overlap.  We should try to cut these to a minimum.

Also, the language and encoding of a spellchecker is probably fixed.  Can a
spell checker change disctionary?  I don't think ispell can, so we can't assume
that.

So I think we can get away with just passing these in the constructor of the
spellchecker:

SpellChecker(string const & language, string const & encoding);

We need to standardize the language strings.  Maybe we should use the two
letter ISO codes (us, de, dk, au, en)?

We don't need access methods for asking the spellchecker which language and
encoding it is.

So based on your design, here is my proposal:

class SpellChecker {
/** Create a spellchecker with given language and encoding.
    Also, a bunch of spell checker specific parameters can
    be specified. */
SpellChecker(string const & language, string const & encoding,
                 string const & parameters);

/** What is the status of the spell checker?
    Before you use a spell checker, you have to make sure
    that it is ok.  The user might not have any spell checker
    installed, so we have to return an error string in this
    case.
    If the spell checker is ok, we return an empty string here.
 */
string errorStatus() const;

/** Define the string the spell checker should work on.
    If it already is started, "start" must be within one
    word of where the spell checker stopped last time.  If you need 
    to skip over an area of the string, use the moveTo method.
 */
void set_string(LString::const_iterator start,
                    LString::const_iterator end);

/** Starts or restarts the spell checking.
    The spell checker will stop when it encounters a misspelling 
    or reaches the end of the string.
 */
void spellcheck();

/// Skip the current word
void skipWord();

/** Return the current misspelled word.
    If we reached the end of the string, the returned string is
    empty. 
  */
LString word() const;

/// Returns a range that surrounds the misspelled word
pair<LString::const_iterator, LString::const_iterator>
word_boundary() const;

/** Skips to given position.
    This method is necessary because the spell checker
    might gather state information. 
  */
void moveTo(LString::constr_iterator position);

/// Returns the location where the spell checker is in the string
LString::const_iterator current() const;

/// Returns the location where the spell checker will end
LString::const_iterator end() const;

/// Returns a list of suggestions for the current word
vector<LString const &> suggestions();

/// Add a word to the personal word list
void addPersonal(LString const & word);

/// Add a word to the session or "ignore all" word list
void addSession(LString const & word);

/// Add a global replacement
void addReplacement(LString const & word, LString const & replacement);

/// Save all relevant word lists, replacements, etc.
void save();
};

Greets,

Asger
Re: LyX New Spell Checker Interface

Reply via email to