Re: predictive.el -- predictive completion of words as you type in Emacs

Toby 'qubit' Cubitt Wed, 01 Mar 2006 03:19:53 -0800

On Sun, Feb 26, 2006 at 04:48:56PM -0000, Phillip Lord wrote:
> > The predictive mode package adds a predictive completion minor mode
> > to Emacs. The sources are too big to post here, but are available
> > from:  
> > 
> > http://www.dr-qubit.org/download.php?file=predictive/predictive.tar.gz
> > 
> > The package's web page can be found at:
> > 
> > http://www.dr-qubit.org/emacs.php
> 
> 
> Looks interesting, but rather like pabbrev.el which does much the same
> thing. Have you tried both?


Wish I'd found pabbrev.el when I was first looking for such a package!
I would probably have contributed code to it instead of writing my
own. But maybe it didn't exist at the time. (I was looking about two
years ago, and predictive.el has been around for almost that long. I
just wasn't confident that it worked well enough to post it to
gnu-emacs-sources before now.)

Still, given both packages do now exist, it does provide for an
interesting (at least for me) comparison of two different approaches
to the same problem. I've had a quick look at pabbrev.el, and here's
my summary of what I think the differences are. I've tried to make
this as unbiased as possible, but obviously I know predictive mode a
lot better, so let me know if I've got something wrong:


1) "Philosophically", predictive mode treats its dictionaries more like
   static reference sources (although they are obviously updated as it
   learns), whereas pabbrev mode treats them more as a dynamic analysis
   of buffer contents. I think most of the differences between the two
   packages stem from this slightly different way of thinking about the
   dictionaries.


User-visible differences:

2) Predictive dictionaries are usually created from a complete list of
   words in the language, e.g. predictive ships with word lists for
   English, LaTeX, HTML etc. Pabbrev dictionaries are created
   dynamically from the words used in buffers. Although it would just
   about be possible to create predictive dictionaries dynamically, it
   is definitely not as easy as just using pabbrev mode.

   The disadvantage is that dictionaries have to be created as a
   separate step before using them. The advantage is that all words in
   the language are always available for completion; they don't need
   to have been used in a buffer already.


3) The pabbrev dictionaries aren't persistent between Emacs
   sessions. Predictive dictionaries are (though it is possible to
   make non-persistent dictionaries). The advantage of persistent
   dictionaries is obvious: word frequency information keeps
   accumulating, so the dictionaries increasingly adapt to your
   writing style.

   Of course, it wouldn't be that difficult to make pabbrev
   dictionaries persistent; it just hasn't been done yet.


4) Predictive mode can automatically switch between different
   dictionaries in different regions of a buffer. For example, it
   automatically uses a dictionary of maths commands within a LaTeX
   equation environment, or of HTML tags after "<".
   
   Along the same lines, I plan to interface predictive mode with the
   semantic package so that it can use information from its
   lexer/parser to suggest even more intelligent completions in
   programming modes (making it into an enhanced version of the
   "Intellisense" feature found in some IDEs).


5) There are differences in the user interface for predictive and
   pabbrev mode. Some are down to nothing more than default key
   bindings (e.g. punctuation characters accept completions in
   predictive mode). Others are differences in the features provided:
   e.g. predictive mode can display completions in a tooltip or menu,
   and the most likely completions can be selected with
   single-character hotkeys.


Low-level differences:

6) The data structures chosen for the dictionaries have different
   trade-offs. Predictive dictionaries have O(log n) lookup for
   completions, with an automatically updated O(1) cache of
   results that took a long time to find. (The data structure for the
   cache is in fact very similar to pabbrev's dictionary
   structure). Pabbrev's dictionaries are O(1) lookup. In practice,
   both are fast enough that completion causes no noticeable delay
   while typing.
   
   I *think* that the predictive dictionaries have better space
   (memory) scaling than the pabbrev dictionaries, with the trade-off
   of the slower lookup (O(log n) instead of O(1)) described above. (I
   need to look more carefully to work out exactly what the scalings
   are.) I suspect that a reasonably complete dictionary of the most
   common English words (say 40,000) would take up a lot of memory
   using pabbrev's structures. Again, this reflects the difference in
   philosophy described above.


7) Predictive doesn't keep the words sorted by frequency. It sorts
   them on the fly when they're looked up. This adds further overhead
   to lookup compared with pabbrev's method (though it does make
   inserting words faster).

   It's amusing that both packages seem to have chosen the
   "wrong" method, given their "philosophy". With its more static
   dictionaries, why does predictive sort on the fly instead of
   storing that information? With its dynamically generated
   dictionaries, why does pabbrev have to keep them sorted in the data
   structure? Of course, the answer is that lookup in predictive is
   fast enough even when sorting on the fly, and word insertion in
   pabbrev runs as an idle process so it too is already fast enough.
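
To make the trade-offs in points 6 and 7 concrete, here is a minimal
Python sketch. It is not code from either package: the class names
`SortOnLookup` and `PresortedPrefixTable` are hypothetical, and the
structures are simplified stand-ins that only assume what is described
above (an ordered store ranked at lookup time versus a hash table of
prefixes whose entries are kept sorted by frequency).

```python
import bisect
from collections import defaultdict


class SortOnLookup:
    """Toy store: words kept in alphabetical order, ranked by frequency at query time."""

    def __init__(self):
        self.words = []  # unique words, alphabetically sorted
        self.freq = {}   # word -> number of times seen

    def add(self, word):
        # Only new words touch the ordered list; repeats just bump a counter.
        if word not in self.freq:
            bisect.insort(self.words, word)
            self.freq[word] = 0
        self.freq[word] += 1

    def complete(self, prefix, limit=3):
        # Binary search finds the first candidate, then we scan the matching run.
        i = bisect.bisect_left(self.words, prefix)
        matches = []
        while i < len(self.words) and self.words[i].startswith(prefix):
            matches.append(self.words[i])
            i += 1
        # Ranking happens here, on every lookup.
        return sorted(matches, key=lambda w: (-self.freq[w], w))[:limit]


class PresortedPrefixTable:
    """Toy store: every prefix maps to a list of words already sorted by frequency."""

    def __init__(self):
        self.freq = {}
        self.by_prefix = defaultdict(list)

    def add(self, word):
        self.freq[word] = self.freq.get(word, 0) + 1
        # Each word is stored once per prefix, and each bucket is re-sorted
        # on insertion, so the work and the memory are paid up front.
        for end in range(1, len(word) + 1):
            bucket = self.by_prefix[word[:end]]
            if word not in bucket:
                bucket.append(word)
            bucket.sort(key=lambda w: (-self.freq[w], w))

    def complete(self, prefix, limit=3):
        # A single hash lookup; the bucket is already ranked.
        return self.by_prefix.get(prefix, [])[:limit]


text = ["the", "then", "the", "there", "than", "there", "the"]
a, b = SortOnLookup(), PresortedPrefixTable()
for w in text:
    a.add(w)
    b.add(w)

print(a.complete("th"))  # ['the', 'there', 'than']
print(b.complete("th"))  # ['the', 'there', 'than']
```

Both stores return the same ranked completions. The difference is
where the cost lands: the first does its sorting at lookup time and
stores each word once, while the second answers with one table access
but stores each word under every one of its prefixes.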


I've probably missed some things, since I haven't played with pabbrev
mode much yet. I don't know how much the packages could benefit from
each other. I dislike duplication, and would have preferred to
contribute to an existing project if I'd known it existed. But trying
to combine them into one package now doesn't look too likely.

Maybe it's nice to have two different approaches to learning and
storing word frequency information. No doubt each will have advantages
in different circumstances. Predictive mode is more "heavyweight",
and a "lighter-weight" package like pabbrev is probably better for the
majority of users.

The predictive user interface code could be useful for pabbrev
though. For example, if you call the `complete' function from
`predictive-completion.el', supplying it with a list of available
completions, it does all the work needed to provisionally insert the
completion in the buffer, to allow accepting, rejecting, cycling,
tab-completing and hotkey-selecting the provisional completion, and to
display completions in the echo area, in a tooltip, in a completion
menu or in a hierarchical completion browser. (Each feature can be
enabled or disabled via customizations.) If you wanted pabbrev mode to
provide similar features, it would make sense to reuse
`predictive-completion.el' so that development on it could benefit
both packages, and the same user customizations would apply to both
packages.
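
As a rough illustration of the accept/reject/cycle behaviour described
above, here is a small Python model of a provisional completion. It is
a sketch under stated assumptions, not the interface of
`predictive-completion.el`: the class `ProvisionalCompletion` and its
method names are hypothetical, and the display side (echo area,
tooltip, menu) is left out entirely.

```python
class ProvisionalCompletion:
    """Hypothetical model of a completion shown provisionally after a typed prefix."""

    def __init__(self, prefix, candidates):
        self.prefix = prefix
        self.candidates = list(candidates)  # assumed already ranked by the caller
        self.index = 0 if self.candidates else None

    def shown(self):
        """Text currently displayed: the provisional candidate, or just the prefix."""
        if self.index is None:
            return self.prefix
        return self.candidates[self.index]

    def cycle(self):
        """Move to the next candidate, wrapping around at the end of the list."""
        if self.index is not None:
            self.index = (self.index + 1) % len(self.candidates)
        return self.shown()

    def accept(self):
        """Keep whatever is currently shown."""
        return self.shown()

    def reject(self):
        """Discard the provisional text and keep only what was typed."""
        return self.prefix


p = ProvisionalCompletion("th", ["the", "there", "than"])
print(p.shown())   # the
print(p.cycle())   # there
print(p.reject())  # th
```

The point of the model is that the caller only supplies a ranked list
of candidates; everything about how the user steps through them lives
in one place, which is what would make such code shareable between
packages.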

Let me know what you think, and feel free to continue privately if
this is getting too off-topic.

Toby
-- 
PhD Student
Quantum Information Theory group
Max Planck Institute for Quantum Optics
Garching, Germany

email: [EMAIL PROTECTED]
web: www.dr-qubit.org


_______________________________________________
Gnu-emacs-sources mailing list
[email protected]
http://lists.gnu.org/mailman/listinfo/gnu-emacs-sources
